WORLD INTELLECTUAL PROPERTY ORGANIZATION 
International Bureau 




PCT

INTERNATIONAL APPLICATION PUBLISHED UNDER THE PATENT COOPERATION TREATY (PCT) 



(51) International Patent Classification:

C12N 15/85, 9/10, C07K 16/30, 16/40,
C12Q 1/68, A01K 67/027



A2 



(11) International Publication Number: WO 99/32644

(43) International Publication Date: 1 July 1999 (01.07.99)



(21) International Application Number: PCT/IB98/02133

(22) International Filing Date: 22 December 1998 (22.12.98) 



(30) Priority Data:

08/996,306    22 December 1997 (22.12.97)    US
60/099,658    9 September 1998 (09.09.98)    US



(71) Applicant (for all designated States except US): GENSET
[FR/FR]; 24, rue Royale, F-75008 Paris (FR).



(72) Inventors; and 

(75) Inventors/Applicants (for US only): COHEN, Daniel [FR/FR];
5, avenue Odette, F-94210 Fontenay-sous-Bois (FR).
BLUMENFELD, Marta [FR/FR]; 5, rue Tagore, F-75013
Paris (FR). CHUMAKOV, Ilya [FR/FR]; 196, rue des
Chèvrefeuilles, F-77000 Vaux-le-Pénil (FR). BOUGUELERET,
Lydie [FR/FR]; 108, avenue Victor-Hugo, F-92170
Vanves (FR).

(74) Agents: MARTIN, Jean-Jacques et al.; Cabinet Regimbeau,
26, avenue Kléber, F-75116 Paris (FR).



(81) Designated States: AL, AM, AT, AU, AZ, BA, BB, BG, BR,
BY, CA, CH, CN, CU, CZ, DE, DK, EE, ES, FI, GB, GD,
GE, GH, GM, HR, HU, ID, IL, IN, IS, JP, KE, KG, KP,
KR, KZ, LC, LK, LR, LS, LT, LU, LV, MD, MG, MK,
MN, MW, MX, NO, NZ, PL, PT, RO, RU, SD, SE, SG,
SI, SK, SL, TJ, TM, TR, TT, UA, UG, US, UZ, VN, YU,
ZW, ARIPO patent (GH, GM, KE, LS, MW, SD, SZ, UG,
ZW), Eurasian patent (AM, AZ, BY, KG, KZ, MD, RU, TJ,
TM), European patent (AT, BE, CH, CY, DE, DK, ES, FI,
FR, GB, GR, IE, IT, LU, MC, NL, PT, SE), OAPI patent
(BF, BJ, CF, CG, CI, CM, GA, GN, GW, ML, MR, NE,
SN, TD, TG).



Published 

Without international search report and to be republished 
upon receipt of that report. 



(54) Title: PROSTATE CANCER GENE 



(57) Abstract 



The present invention relates to PG1, a gene associated with prostate cancer. The invention provides polynucleotides including
biallelic markers derived from PG1 and from flanking genomic regions. Primers hybridizing to these biallelic markers and regions flanking
are also provided. This invention provides polynucleotides and methods suitable for genotyping a nucleic acid containing sample for one or
more biallelic markers of the invention. Further, the invention provides methods to detect a statistical correlation between a biallelic marker
allele and prostate cancer and between a haplotype and prostate cancer. The invention also relates to diagnostic methods of determining
whether an individual is at risk for developing prostate cancer, and whether an individual suffers from prostate cancer as a result of a
mutation in the PG1 gene.



FOR THE PURPOSES OF INFORMATION ONLY 



Codes used to identify States party to the PCT on the front pages of pamphlets publishing international applications under the PCT. 



AL  Albania
AM  Armenia
AT  Austria
AU  Australia
AZ  Azerbaijan
BA  Bosnia and Herzegovina
BB  Barbados
BE  Belgium
BF  Burkina Faso
BG  Bulgaria
BJ  Benin
BR  Brazil
BY  Belarus
CA  Canada
CF  Central African Republic
CG  Congo
CH  Switzerland
CI  Côte d'Ivoire
CM  Cameroon
CN  China
CU  Cuba
CZ  Czech Republic
DE  Germany
DK  Denmark
EE  Estonia
ES  Spain
FI  Finland
FR  France
GA  Gabon
GB  United Kingdom
GE  Georgia
GH  Ghana
GN  Guinea
GR  Greece
HU  Hungary
IE  Ireland
IL  Israel
IS  Iceland
IT  Italy
JP  Japan
KE  Kenya
KG  Kyrgyzstan
KP  Democratic People's Republic of Korea
KR  Republic of Korea
KZ  Kazakstan
LC  Saint Lucia
LI  Liechtenstein
LK  Sri Lanka
LR  Liberia
LS  Lesotho
LT  Lithuania
LU  Luxembourg
LV  Latvia
MC  Monaco
MD  Republic of Moldova
MG  Madagascar
MK  The former Yugoslav Republic of Macedonia
ML  Mali
MN  Mongolia
MR  Mauritania
MW  Malawi
MX  Mexico
NE  Niger
NL  Netherlands
NO  Norway
NZ  New Zealand
PL  Poland
PT  Portugal
RO  Romania
RU  Russian Federation
SD  Sudan
SE  Sweden
SG  Singapore
SI  Slovenia
SK  Slovakia
SN  Senegal
SZ  Swaziland
TD  Chad
TG  Togo
TJ  Tajikistan
TM  Turkmenistan
TR  Turkey
TT  Trinidad and Tobago
UA  Ukraine
UG  Uganda
US  United States of America
UZ  Uzbekistan
VN  Viet Nam
YU  Yugoslavia
ZW  Zimbabwe







WO 99/32644    PCT/IB98/02133



PROSTATE CANCER GENE 

Background of the Invention
A cancer is a clonal proliferation of cells produced as a consequence of cumulative genetic
damage that finally results in unrestrained cell growth, tissue invasion and metastasis (cell
transformation). Regardless of the type of cancer, transformed cells carry damaged DNA in many
forms: as gross chromosomal translocations or, more subtly, as DNA amplification, rearrangement or 
even point mutations. 

Some oncogenic mutations may be inherited in the germline, thus predisposing the mutation carrier
to an increased risk of cancer. However, in a majority of cases, cancer does not occur as a simple
monogenic disease with clear Mendelian inheritance. There is only a two- or threefold increased risk
of cancer among first-degree relatives for many cancers (Mulvihill JJ, Miller RW & Fraumeni JF,
1977, Genetics of human cancer Vol 3, New York Raven Press). Alternatively, DNA damage may be
acquired somatically, probably induced by exposure to environmental carcinogens. Somatic
mutations are generally responsible for the vast majority of cancer cases.

Studies of the age dependence of cancer have suggested that several successive mutations are 
needed to convert a normal cell into an invasive carcinoma. Since human mutation rates are typically 
10^-6/gene/cell, the chance of a single cell undergoing many independent mutations is very low (Loeb
LA, Cancer Res 1991, 51:3075-3079). Cancer nevertheless happens because of a combination of two
mechanisms. Some mutations enhance cell proliferation, increasing the target population of cells for
the next mutation. Other mutations affect the stability of the entire genome, increasing the overall
mutation rate, as in the case of mismatch repair proteins (reviewed in Arnheim N & Shibata D, Curr.
Op. Genetics & Development, 1997, 7:364-370).
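
The argument that several independent mutations in one cell are improbable can be illustrated with a short calculation. The sketch below assumes a rate of 10^-6 mutations per gene per cell and treats the mutations as independent. The function name and the values of k are hypothetical and chosen only for illustration.

```python
# Illustrative arithmetic only. The rate is an assumed figure and the
# numbers of required mutations (k) are hypothetical examples.
MUTATION_RATE = 1e-6  # assumed mutations per gene per cell


def prob_independent_hits(k: int, rate: float = MUTATION_RATE) -> float:
    """Probability that k specific genes are all mutated in one cell,
    assuming the k mutations arise independently at the same rate."""
    if k < 0:
        raise ValueError("k must be non-negative")
    return rate ** k


if __name__ == "__main__":
    for k in (1, 2, 4, 6):
        print(f"{k} independent mutations: {prob_independent_hits(k):.0e}")
```

Under these assumptions the probability falls by six orders of magnitude with each additional required mutation, which is why mechanisms that enlarge the target cell population or raise the mutation rate matter.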
An intricate process known as the cell cycle drives normal proliferation of cells in an
organism. Regulation of the extent of cell cycle activity and the orderly execution of sequential steps 
within the cycle ensure the normal development and homeostasis of the organism. Conversely, many 
of the properties of cancer cells - uncontrolled proliferation, increased mutation rate, abnormal 
translocations and gene amplifications - can be attributed directly to perturbations of the normal 
regulation or progression of the cycle. In fact, many of the genes that have been identified over the
past several decades as being involved in cancer, can now be appreciated in terms of their direct or 
indirect role in either regulating entry into the cell cycle or coordinating events within the cell cycle. 

Recent studies have identified three groups of genes which are frequently mutated in cancer. 
The first group of genes, called oncogenes, are genes whose products activate cell proliferation. The 
normal non-mutant versions are called protooncogenes. The mutated forms are excessively or
inappropriately active in promoting cell proliferation, and act in the cell in a dominant way in that a
single mutant allele is enough to affect the cell phenotype. Activated oncogenes are rarely transmitted
as germline mutations since they would probably be lethal when expressed in all the cells. Therefore
oncogenes can only be investigated in tumor tissues.

Oncogenes and protooncogenes can be classified into several different categories according to 
their function. This classification includes genes that code for proteins involved in signal
transduction such as: growth factors (e.g., sis, int-2); receptor and non-receptor protein-tyrosine
kinases (e.g., erbB, src, bcr-abl, met, trk); membrane-associated G proteins (e.g., ras); cytoplasmic
protein kinases (e.g., mitogen-activated protein kinase -MAPK- family, raf, mos, pak); or nuclear
transcription factors (e.g., myc, myb, fos, jun, rel) (for review see Hunter T, 1991, Cell 64:249; Fanger
GR et al., 1997, Curr.Op.Genet.Dev. 7:67-74; Weiss FU et al., ibid. 80-86).

The second group of genes which are frequently mutated in cancer, called tumor suppressor 
genes, are genes whose products inhibit cell growth. Mutant versions in cancer cells have lost their 
normal function, and act in the cell in a recessive way in that both copies of the gene must be 
inactivated in order to change the cell phenotype. Most importantly, the tumor phenotype can be 
rescued by the wild type allele, as shown by cell fusion experiments first described by Harris and
colleagues (Harris H et al., 1969, Nature 223:363-368). Germline mutations of tumor suppressor genes
may be transmitted and thus studied in both constitutional and tumor DNA from familial or sporadic cases.
The current family of tumor suppressors includes DNA-binding transcription factors (e.g., p53, WT1),
transcription regulators (e.g., RB, APC, probably BRCA1), protein kinase inhibitors (e.g., p16), among
others (for review, see Haber D & Harlow E, 1997, Nature Genet. 16:320-322).

The third group of genes which are frequently mutated in cancer, called mutator genes, are 
responsible for maintaining genome integrity and/or low mutation rates. Loss of function of both
alleles increases cell mutation rates, and as a consequence, proto-oncogenes and tumor suppressor genes
may be mutated. Mutator genes can also be classified as tumor suppressor genes, except for the fact that
tumorigenesis caused by this class of genes cannot be suppressed simply by restoration of a wild-type
allele, as described above. Genes whose inactivation may lead to a mutator phenotype include
mismatch repair genes (e.g., MLH1, MSH2), DNA helicases (e.g., BLM, WRN) or other genes
involved in DNA repair and genomic stability (e.g., p53, possibly BRCA1 and BRCA2) (for review
see Haber D & Harlow E, 1997, Nature Genet. 16:320-322; Fishel R & Wilson T, 1997,
Curr.Op.Genet.Dev. 7:105-113; Ellis NA, 1997, ibid. 354-363).

The recent development of sophisticated techniques for genetic mapping has resulted in an 
ever expanding list of genes associated with particular types of human cancers. The human haploid 
genome contains an estimated 80,000 to 100,000 genes scattered on a 3 x 10^9 base-long double-
stranded DNA. Each human being is diploid, i.e., possesses two haploid genomes, one from paternal
origin, the other from maternal origin. The sequence of a given genetic locus may vary between
individuals in a population or between the two copies of the locus on the chromosomes of a single
individual. Genetic mapping techniques often exploit these differences, which are called
polymorphisms, to map the location of genes associated with human phenotypes. 

One mapping technique, called the loss of heterozygosity (LOH) technique, is often employed 
to detect genes in which a loss of function results in a cancer, such as the tumor suppressor genes 
described above. Tumor suppressor genes often produce cancer via a two hit mechanism in which a
first mutation, such as a point mutation (or a small deletion or insertion) inactivates one allele of the 
tumor suppressor gene. Often, this first mutation is inherited from generation to generation. 

A second mutation, often a spontaneous somatic mutation such as a deletion which deletes all 
or part of the chromosome carrying the other copy of the tumor suppressor gene, results in a cell in 
which both copies of the tumor suppressor gene are inactive.

As a consequence of the deletion in the tumor suppressor gene, one allele is lost for any 
genetic marker located close to the tumor suppressor gene. Thus, if the patient is heterozygous for a 
marker, the tumor tissue loses heterozygosity, becoming homozygous or hemizygous. This loss of 
heterozygosity generally provides strong evidence for the existence of a tumor suppressor gene in the 
lost region.

By genotyping pairs of blood and tumor samples from affected individuals with a set of highly 
polymorphic genetic markers, such as microsatellites, covering the whole genome, one can discover
candidate locations for tumor suppressor genes. Due to the presence of contaminant non-tumor tissue
in most pathological tumor samples, a decreased relative intensity rather than total loss of
heterozygosity of informative microsatellites is observed in the tumor samples. Therefore, classic
LOH analysis generally requires quantitative PCR analysis, often limiting the power of detection of
this technique. Another limitation of LOH studies resides in the fact that they only allow the
definition of rather large candidate regions, typically spanning several megabases. Refinement
of such candidate regions requires the definition of the minimally overlapping portion of LOH regions
identified in tumor tissues from several hundreds of affected patients.
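
The "decreased relative intensity" observed at an informative (heterozygous) marker is commonly summarized as a ratio of allele ratios between tumor and matched normal DNA. The following is a minimal sketch of that idea, not a method taken from this document. The function names, the example peak intensities and the 0.5 cutoff are all hypothetical.

```python
def allelic_imbalance_ratio(normal_a1: float, normal_a2: float,
                            tumor_a1: float, tumor_a2: float) -> float:
    """Ratio of allele peak-intensity ratios, tumor versus matched normal.

    Values near 1 suggest both alleles are retained; values well below 1
    suggest that one allele is under-represented in the tumor sample.
    """
    for value in (normal_a1, normal_a2, tumor_a1, tumor_a2):
        if value <= 0:
            raise ValueError("peak intensities must be positive")
    ratio = (tumor_a1 / tumor_a2) / (normal_a1 / normal_a2)
    # Fold the ratio so the result is <= 1 whichever allele was lost.
    return ratio if ratio <= 1 else 1 / ratio


def suggests_loh(ratio: float, threshold: float = 0.5) -> bool:
    """Apply a cutoff; the default 0.5 is a hypothetical choice."""
    return ratio < threshold


if __name__ == "__main__":
    # Hypothetical peak intensities for one heterozygous microsatellite.
    r = allelic_imbalance_ratio(1000, 900, 950, 300)
    print(round(r, 3), suggests_loh(r))
```

Because normal tissue contaminating the tumor sample pulls the ratio back toward 1, the choice of cutoff trades sensitivity against false positives, which is one reason the technique has limited power.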

Another approach to genetic mapping, called linkage analysis, is based upon 
establishing a correlation between the transmission of genetic markers and that of a specific trait 
throughout generations within a family. In this approach, all members of a series of affected families 
are genotyped with a few hundred markers, typically microsatellite markers, which are distributed at 
an average density of one every 10 Mb. By comparing genotypes in all family members, one can
attribute sets of alleles to parental haploid genomes (haplotyping or phase determination). The origin 
of recombined fragments is then determined in the offspring of all families. Those that co-segregate 
with the trait are tracked. After pooling data from all families, statistical methods are used to 
determine the likelihood that the marker and the trait are segregating independently in all families. As 
a result of the statistical analysis, one or several regions are selected as candidates, based on their high
probability to carry a trait causing allele. The result of linkage analysis is considered significant
when the chance of independent segregation is lower than 1 in 1000 (expressed as a LOD score > 3).
Identification of recombinant individuals using additional markers allows further delineation of the
candidate linked region, which most usually ranges from 2 to 20 Mb.
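
The LOD score threshold can be made concrete with the standard two-point formula for phase-known meioses, LOD(theta) = log10[L(theta) / L(0.5)] with L(theta) = theta^r (1 - theta)^(n - r), where r of n meioses are recombinant. The sketch below assumes this simplified setting; real pedigree analysis is considerably more involved, and the function name and example counts are hypothetical.

```python
import math


def lod_score(recombinants: int, meioses: int, theta: float) -> float:
    """Two-point LOD score for phase-known meioses.

    Compares the likelihood of the data at recombination fraction theta
    with the likelihood under independent segregation (theta = 0.5).
    """
    if not 0 <= recombinants <= meioses:
        raise ValueError("recombinants must lie between 0 and meioses")
    if not 0 < theta <= 0.5:
        raise ValueError("theta must be in the interval (0, 0.5]")
    non_recombinants = meioses - recombinants
    return (recombinants * math.log10(theta)
            + non_recombinants * math.log10(1 - theta)
            - meioses * math.log10(0.5))


if __name__ == "__main__":
    # Hypothetical data: 1 recombinant among 20 informative meioses.
    score = lod_score(1, 20, 0.05)
    print(f"LOD = {score:.2f}; odds of about {10 ** score:.0f} to 1")
```

A LOD score of 3 corresponds to odds of 1000 to 1 in favor of linkage, which matches the "1 in 1000" criterion quoted above.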

Linkage analysis studies have generally relied on the use of microsatellite markers (also
called simple tandem repeat polymorphisms, or simple sequence length polymorphisms). These
include small arrays of tandem repeats of simple sequences (di-, tri-, or tetranucleotide repeats), which
exhibit a high degree of length polymorphism, and thus a high level of informativeness. To date, only
slightly more than 5,000 microsatellites have been ordered along the human genome (Dib et al., Nature
1996, 380:152), thus limiting the maximum attainable resolution of linkage analysis to ca. 600 kb on
average.
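
The 600 kb figure follows from dividing the genome length by the number of ordered markers. A short check, assuming a haploid genome of about 3 x 10^9 bases and evenly spaced markers (the function name is hypothetical):

```python
GENOME_SIZE_BP = 3e9     # approximate haploid human genome length in bases
ORDERED_MARKERS = 5000   # microsatellites ordered along the genome


def mean_marker_spacing_kb(genome_bp: float, n_markers: int) -> float:
    """Average distance between adjacent markers, in kilobases,
    assuming the markers are spread evenly along the genome."""
    if n_markers <= 0:
        raise ValueError("n_markers must be positive")
    return genome_bp / n_markers / 1000


if __name__ == "__main__":
    spacing = mean_marker_spacing_kb(GENOME_SIZE_BP, ORDERED_MARKERS)
    print(f"average spacing: {spacing:.0f} kb")
```

Real marker maps are not evenly spaced, so local resolution can be better or worse than this average.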

Linkage analysis has been successfully applied to map simple genetic traits that show clear 
Mendelian inheritance patterns. About 100 pathological trait-causing genes were discovered by 
linkage analysis over the last 10 years. 

However, linkage analysis approaches have proven difficult for complex genetic traits, those
probably due to the combined action of multiple genes and/or environmental factors. In such cases,
too large an effort and cost are needed to recruit the adequate number of affected families required for
applying linkage analysis to these situations, as recently discussed by Risch, N. and Merikangas, K.
(Science 1996, 273:1516-1517). Finally, linkage analysis cannot be applied to the study of traits for
which no large informative families are available. Typically, this will be the case in any
attempt to identify trait-causing alleles involved in sporadic cases.

The incidence of prostate cancer has dramatically increased over the last decades. It averages 
30-50/100,000 males both in Western European countries as well as within the US White male 
population. In these countries, it has recently become the most commonly diagnosed malignancy, 
being one of every four cancers diagnosed in American males. Prostate cancer's incidence is very
much population specific, since it varies from 2/100,000 in China, to over 80/100,000 among African-
American males.

In France, the incidence of prostate cancer is 35/100,000 males and it is increasing by 
10/100,000 per decade. Mortality due to prostate cancer is also growing accordingly. It is the second
cause of cancer death among French males, and the first one among French males aged over 70. This
makes prostate cancer a serious burden in terms of public health, especially in view of the aging of
populations. 

An average 40% reduction in life expectancy affects males with prostate cancer. If 
completely localized, prostate cancer can be cured by surgery, with however an average success rate 
of only ca. 50%. If diagnosed after metastasis from the prostate, prostate cancer is a fatal disease for 
which there is no curative treatment.

Early-stage diagnosis relies on Prostate Specific Antigen (PSA) dosage, and would allow the
detection of prostate cancer seven years before clinical symptoms become apparent. The
effectiveness of PSA dosage diagnosis is however limited, due to its inability to discriminate between 
malignant and non-malignant affections of the organ. 

Therefore, there is a strong need for both a reliable diagnostic procedure which would enable 
early-stage prostate cancer prognosis, and for preventive and curative treatments of the disease. The
present invention relates to the PG1 gene, a gene associated with prostate cancer, as well as
diagnostic methods and reagents for detecting alleles of the gene which may cause prostate cancer, 
and therapies for treating prostate cancer. 

Summary of the Invention

The present invention relates to the identification of a gene associated with prostate cancer, 
identified as the PG1 gene, and reagents, diagnostics, and therapies related thereto. The present
invention is also based on the discovery of a novel set of PG1-related biallelic markers. See the
definition of PG1-related biallelic markers in the Detailed Description Section. These markers are
located in the coding regions as well as non-coding regions adjacent to the PG1 gene. The position of
these markers and knowledge of the surrounding sequence has been used to design polynucleotide 
compositions which are useful in determining the identity of nucleotides at the marker position, as 
well as more complex association and haplotyping studies which are useful in determining the genetic 
basis for diseases including cancer and prostate cancer. In addition, the compositions and methods of 
the invention find use in the identification of the targets for the development of pharmaceutical agents
and diagnostic methods, as well as the characterization of the differential efficacious responses to and 
side effects from pharmaceutical agents acting on diseases including cancer and prostate cancer. 

A first embodiment of the invention is a recombinant, purified or isolated polynucleotide 
comprising, or consisting of a mammalian genomic sequence, gene, or fragments thereof. In one 
aspect the sequence is derived from a human, mouse or other mammal. In a preferred aspect, the
genomic sequence is the human genomic sequence of SEQ ID NO: 179 or the complement thereto. In 
a second preferred aspect, the genomic sequence is selected from one of the two mouse genomic 
fragments of SEQ ID NO: 182 and 183. In yet another aspect of this embodiment, the nucleic acid 
comprises nucleotides 1629 through 1870 of the sequence of SEQ ID NO: 179. Optionally, said 
polynucleotide consists of, consists essentially of, or comprises a contiguous span of nucleotides of a
mammalian genomic sequence, preferably a sequence selected from the following SEQ ID NOs: 179, 182,
and 183, wherein said contiguous span is at least 6, 8, 10, 12, 15, 20, 25, 30, 50, 100, 200, or 500
nucleotides in length. 
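
The notion of a "contiguous span of at least N nucleotides" of a reference sequence or its complement can be illustrated with a small string comparison. This is a sketch only: the sequences below are toy strings and not any SEQ ID NO of this document, and the function names are hypothetical.

```python
def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA string over the alphabet A, C, G, T."""
    complement = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(complement[base] for base in reversed(seq.upper()))


def longest_shared_span(query: str, reference: str) -> int:
    """Length of the longest contiguous run of bases present in both
    sequences (longest common substring, by dynamic programming)."""
    query, reference = query.upper(), reference.upper()
    best = 0
    prev = [0] * (len(reference) + 1)
    for q in query:
        curr = [0] * (len(reference) + 1)
        for j, r in enumerate(reference, start=1):
            if q == r:
                curr[j] = prev[j - 1] + 1
                best = max(best, curr[j])
        prev = curr
    return best


def has_contiguous_span(query: str, reference: str, min_len: int) -> bool:
    """True if the query shares a span of at least min_len bases with the
    reference or with the complementary strand of the reference."""
    forward = longest_shared_span(query, reference)
    reverse = longest_shared_span(reverse_complement(query), reference)
    return max(forward, reverse) >= min_len


if __name__ == "__main__":
    toy_reference = "ACGTTGCAAGGCTTAACCGG"  # toy sequence, illustration only
    print(has_contiguous_span("TTGCAAGGCT", toy_reference, 8))
    print(has_contiguous_span("AGCCTTGCAA", toy_reference, 10))
```

The quadratic comparison is adequate for short probes and primers; longer genomic sequences would call for an indexed search.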

A second embodiment of the present invention is a recombinant, purified or isolated 
polynucleotide comprising, or consisting of a mammalian cDNA sequence, or fragments thereof. In
one aspect the sequence is derived from a human, mouse or other mammal. In a preferred aspect, the
cDNA sequence is selected from the human cDNA sequences of SEQ ID NO: 3, 69, 112-124 or the
complement thereto. In a second preferred aspect, the cDNA sequence is the mouse cDNA sequence
of SEQ ID NO: 184. Optionally, said polynucleotide consists of, consists essentially of, or comprises
a contiguous span of nucleotides of a mammalian genomic sequence, preferably a sequence selected
from the following SEQ ID NOs: 3, 69, 112-124 and 184, wherein said contiguous span is at least 6, 8, 10,
12, 15, 20, 25, 30, 50, 100, 200, or 500 nucleotides in length. 

A third embodiment of the present invention is a recombinant, purified or isolated 
polynucleotide, or the complement thereof, encoding a mammalian PG1 protein, or a fragment
thereof. In one aspect the PG1 protein sequence is from a human, mouse or other mammal. In a
preferred aspect, the PG1 protein sequence is selected from the human PG1 protein sequences of SEQ
ID NO: 4, 5, 70, and 125-136. In a second preferred aspect, the PG1 protein sequence is the mouse
PG1 protein sequence of SEQ ID NO: 74. Optionally, said fragment of PG1 polypeptide consists of,
consists essentially of, or comprises a contiguous stretch of at least 8, 10, 12, 15, 20, 25, 30, 50, 100
or 200 amino acids from SEQ ID NOs: 4, 5, 70, 74, and 125-136, as well as any other human, mouse
or mammalian PG1 polypeptide.

A fourth embodiment of the invention encompasses the polynucleotide primers and probes disclosed
herein.

A fifth embodiment of the present invention is a recombinant, purified or isolated polypeptide 
comprising or consisting of a mammalian PG1 protein, or a fragment thereof. In one aspect the PG1
protein sequence is from a human, mouse or other mammal. In a preferred aspect, the PG1 protein
sequence is selected from the human PG1 protein sequences of SEQ ID NO: 4, 5, 70, and 125-136. In
a second preferred aspect, the PG1 protein sequence is the mouse PG1 protein sequence of SEQ ID
NO: 74. Optionally, said fragment of PG1 polypeptide consists of, consists essentially of, or
comprises a contiguous stretch of at least 8, 10, 12, 15, 20, 25, 30, 50, 100 or 200 amino acids from
SEQ ID NOs: 4, 5, 70, 74, and 125-136, as well as any other human, mouse or mammalian PG1
polypeptide. 

A sixth embodiment of the present invention is an antibody composition capable of 
specifically binding to a polypeptide of the invention. Optionally, said antibody is polyclonal or 
monoclonal. Optionally, said polypeptide is an epitope-containing fragment of at least 8, 10, 12, 15,
20, 25, or 30 amino acids of a human, mouse, or mammalian PG1 protein, preferably a sequence selected
from SEQ ID NOs: 4, 5, 70, 74, or 125-136. 

A seventh embodiment of the present invention is a vector comprising any polynucleotide of the 
invention. Optionally, said vector is an expression vector, gene therapy vector, amplification vector, 
gene targeting vector, or knock-out vector.

An eighth embodiment of the present invention is a host cell comprising any vector of the
invention.

A ninth embodiment of the present invention is a mammalian host cell comprising a PG1 gene
disrupted by homologous recombination with a knock out vector.

A tenth embodiment of the present invention is a nonhuman host mammal or animal 
comprising a vector of the invention. 
A further embodiment of the present invention is a nonhuman host mammal comprising a
PG1 gene disrupted by homologous recombination with a knock out vector.

Another embodiment of the present invention is a method of determining whether an 
individual is at risk of developing cancer or prostate cancer at a later date or whether the individual 
suffers from cancer or prostate cancer as a result of a mutation in the PG1 gene comprising obtaining
a nucleic acid sample from the individual; and determining whether the nucleotides present at one or
more of the PG1-related biallelic markers of the invention are indicative of a risk of developing
prostate cancer at a later date or indicative of prostate cancer resulting from a mutation in the PG1
gene. Optionally, said PG1-related biallelic marker is a PG1-related biallelic marker positioned in SEQ ID
NO: 179; a PG1-related biallelic marker selected from the group consisting of 99-1485/251, 99-622/95,
99-619/141, 4-76/222, 4-77/151, 4-71/233, 4-72/127, 4-73/134, 99-610/250, 99-609/225, 4-90/283,
99-602/258, 99-600/492, 99-598/130, 99-217/277, 99-576/421, 4-61/269, 4-66/145, and 4-67/40;
or a PG1-related biallelic marker selected from the group consisting of 99-622, 4-77, 4-71, 4-73,
99-598, 99-576, and 4-66.

Another embodiment of the present invention is a method of determining whether an 
individual is at risk of developing prostate cancer at a later date or whether the individual suffers from
prostate cancer as a result of a mutation in the PG1 gene comprising obtaining a nucleic acid sample
from the individual and determining the nucleotides present at one or more of the
polymorphic bases in a PG1-related biallelic marker. Optionally, said PG1-related biallelic marker is a
PG1-related biallelic marker positioned in SEQ ID NO: 179; a PG1-related biallelic marker selected from
the group consisting of 99-1485/251, 99-622/95, 99-619/141, 4-76/222, 4-77/151, 4-71/233, 4-72/127,
4-73/134, 99-610/250, 99-609/225, 4-90/283, 99-602/258, 99-600/492, 99-598/130, 99-217/277,
99-576/421, 4-61/269, 4-66/145, and 4-67/40; or a PG1-related biallelic marker selected from the group
consisting of 99-622, 4-77, 4-71, 4-73, 99-598, 99-576, and 4-66.

Another embodiment of the present invention is a method of obtaining an allele of the PG1
gene which is associated with a detectable phenotype comprising obtaining a nucleic acid sample
from an individual expressing the detectable phenotype, contacting the nucleic acid sample with an
agent capable of specifically detecting a nucleic acid encoding the PG1 protein, and isolating the
nucleic acid encoding the PG1 protein. In one aspect of this method, the contacting step comprises
contacting the nucleic acid sample with at least one nucleic acid probe capable of specifically
hybridizing to said nucleic acid encoding the PG1 protein. In another aspect of this embodiment, the
contacting step comprises contacting the nucleic acid sample with an antibody capable of specifically
binding to the PG1 protein. In another aspect of this embodiment, the step of obtaining a nucleic acid
sample from an individual expressing a detectable phenotype comprises obtaining a nucleic acid 
sample from an individual suffering from prostate cancer. 

Another embodiment of the present invention is a method of obtaining an allele of the PG1
gene which is associated with a detectable phenotype comprising obtaining a nucleic acid sample
from an individual expressing the detectable phenotype, contacting the nucleic acid sample with an
agent capable of specifically detecting a sequence within the 8p23 region of the human genome,
identifying a nucleic acid encoding the PG1 protein in the nucleic acid sample, and isolating the
nucleic acid encoding the PG1 protein. In one aspect of this embodiment, the nucleic acid sample is
obtained from an individual suffering from cancer or prostate cancer.

Another embodiment of the present invention is a method of categorizing the risk of prostate 
cancer in an individual comprising the step of assaying a sample taken from the individual to 
determine whether the individual carries an allelic variant of PG1 associated with an increased risk of
prostate cancer. In one aspect of this embodiment, the sample is a nucleic acid sample. In another
aspect a nucleic acid sample is assayed by determining the frequency of the PG1 transcripts present.
In another aspect of this embodiment, the sample is a protein sample. In another aspect of this
embodiment, the method further comprises determining whether the PG1 protein in the sample binds
an antibody specific for a PG1 isoform associated with prostate cancer.

Another embodiment of the present invention is a method of categorizing the risk of prostate 
cancer in an individual comprising the step of determining whether the identities of the polymorphic
bases of one or more biallelic markers which are in linkage disequilibrium with the PG1 gene are
indicative of an increased risk of prostate cancer. 
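
One common way to test whether an allele of a biallelic marker is associated with disease is a Pearson chi-square test on allele counts in cases and controls. The sketch below illustrates that general technique and is not presented as the statistical method of this document. The function name and all counts are hypothetical.

```python
def chi_square_2x2(a: int, b: int, c: int, d: int) -> float:
    """Pearson chi-square statistic (1 degree of freedom, no continuity
    correction) for the 2 x 2 table of allele counts:

                  allele 1   allele 2
        cases         a          b
        controls      c          d
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("every row and column total must be non-zero")
    return n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)


if __name__ == "__main__":
    # Hypothetical allele counts: 100 cases and 100 controls (200
    # chromosomes each), scored at a single biallelic marker.
    statistic = chi_square_2x2(120, 80, 90, 110)
    # 3.841 is the 5% critical value of chi-square with 1 degree of freedom.
    print(f"chi-square = {statistic:.2f}; exceeds 3.841: {statistic > 3.841}")
```

A significant result for a marker in linkage disequilibrium with a gene does not by itself show that the marker is causal; it indicates only that an associated allele lies nearby.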

Another embodiment of the present invention comprises a method of identifying molecules 
which specifically bind to a PG1 protein, preferably the protein of SEQ ID NO:4 or a portion thereof,
comprising the steps of introducing a nucleic acid encoding the protein of SEQ ID NO:4 or a
portion thereof into a cell such that the protein of SEQ ID NO:4 or a portion thereof contacts proteins
expressed in the cell and identifying those proteins expressed in the cell which specifically interact
with the protein of SEQ ID NO:4 or a portion thereof.

Another embodiment of the present invention is a method of identifying molecules which 
specifically bind to the protein of SEQ ID NO:4 or a portion thereof. One step of the method
comprises linking a first nucleic acid encoding the protein of SEQ ID NO:4 or a portion thereof to a
first indicator nucleic acid encoding a first indicator polypeptide to generate a first chimeric nucleic
acid encoding a first fusion protein. The first fusion protein comprises the protein of SEQ ID NO:4 or
a portion thereof and the first indicator polypeptide. Another step of the method comprises linking a
second nucleic acid encoding a test polypeptide to a second indicator nucleic acid
encoding a second indicator polypeptide to generate a second chimeric nucleic acid encoding a second
fusion protein. The second fusion protein comprises the test polypeptide and the second indicator
polypeptide. Association between the first indicator protein and the second indicator protein produces
a detectable result. Another step of the method comprises introducing the first chimeric nucleic acid
and the second chimeric nucleic acid into a cell. Another step comprises detecting the detectable
result.

A further embodiment of the invention is a purified or isolated mammalian PG1 gene or
cDNA sequence. 

Further embodiments of the present invention include the nucleic acid and amino acid 
sequences of mutant or low frequency PG1 alleles derived from prostate cancer patients, tissues or
cell lines. The present invention also encompasses methods which utilize detection of these mutant
PG1 sequences in an individual or tissue sample to diagnose prostate cancer, assess the risk of
developing prostate cancer or assess the likely severity of a particular prostate tumor. 

Another embodiment of the invention encompasses any polynucleotide of the invention
attached to a solid support. In addition, the polynucleotides of the invention which are attached to a
solid support encompass polynucleotides with any further limitation described in this disclosure, or
those following: Optionally, said polynucleotides are specified as attached individually or in groups of
at least 2, 5, 8, 10, 12, 15, 20, or 25 distinct polynucleotides of the invention to a single solid support.
Optionally, polynucleotides other than those of the invention may be attached to the same solid support
as polynucleotides of the invention. Optionally, when multiple polynucleotides are attached to a solid
support they are attached at random locations, or in an ordered array. Optionally, said ordered array is
addressable.

An additional embodiment of the invention encompasses the use of any polynucleotide for, or
any polynucleotide for use in, determining the identity of an allele at a PG1-related biallelic marker.
In addition, the polynucleotides of the invention for use in determining the identity of an allele at a
PG1-related biallelic marker encompass polynucleotides with any further limitation described in this
disclosure, or those following: Optionally, said PG1-related biallelic marker is a PG1-related biallelic
marker positioned in SEQ ID NO: 179; a PG1-related biallelic marker selected from the group
consisting of 99-1485/251, 99-622/95, 99-619/141, 4-76/222, 4-77/151, 4-71/233, 4-72/127, 4-73/134,
99-610/250, 99-609/225, 4-90/283, 99-602/258, 99-600/492, 99-598/130, 99-217/277, 99-576/421,
4-61/269, 4-66/145, and 4-67/40; or a PG1-related biallelic marker selected from the group consisting of
99-622, 4-77, 4-71, 4-73, 99-598, 99-576, and 4-66. Optionally, said polynucleotide may
comprise a sequence disclosed in the present specification. Optionally, said polynucleotide may
consist of, or consist essentially of, any polynucleotide described in the present specification.
Optionally, said determining is performed in a hybridization assay, sequencing assay,
microsequencing assay, or allele-specific amplification assay. Optionally, said polynucleotide is
attached to a solid support, array, or addressable array. Optionally, said polynucleotide is labeled.

Another embodiment of the invention encompasses the use of any polynucleotide for, or any
polynucleotide for use in, amplifying a segment of nucleotides comprising a PG1-related biallelic
marker. In addition, the polynucleotides of the invention for use in amplifying a segment of
nucleotides comprising a PG1-related biallelic marker encompass polynucleotides with any further
limitation described in this disclosure, or those following: Optionally, said PG1-related biallelic
marker is a PG1-related biallelic marker positioned in SEQ ID NO: 179; a PG1-related biallelic
marker selected from the group consisting of 99-1485/251, 99-622/95, 99-619/141, 4-76/222,
4-77/151, 4-71/233, 4-72/127, 4-73/134, 99-610/250, 99-609/225, 4-90/283, 99-602/258, 99-600/492,
99-598/130, 99-217/277, 99-576/421, 4-61/269, 4-66/145, and 4-67/40; or a PG1-related biallelic
marker selected from the group consisting of 99-622, 4-77, 4-71, 4-73, 99-598, 99-576, and 4-66.
Optionally, said polynucleotide may comprise a sequence disclosed in the present specification.
Optionally, said polynucleotide may consist of, or consist essentially of, any polynucleotide described
in the present specification. Optionally, said amplifying is performed by PCR or LCR. Optionally,
said polynucleotide is attached to a solid support, array, or addressable array. Optionally, said
polynucleotide is labeled.

A further embodiment of the invention encompasses methods of genotyping a biological
sample comprising determining the identity of an allele at a PG1-related biallelic marker. In addition,
the genotyping methods of the invention encompass methods with any further limitation described in
this disclosure, or those following: Optionally, said PG1-related biallelic marker is a PG1-related
biallelic marker positioned in SEQ ID NO: 179; a PG1-related biallelic marker selected from the
group consisting of 99-1485/251, 99-622/95, 99-619/141, 4-76/222, 4-77/151, 4-71/233, 4-72/127,
4-73/134, 99-610/250, 99-609/225, 4-90/283, 99-602/258, 99-600/492, 99-598/130, 99-217/277,
99-576/421, 4-61/269, 4-66/145, and 4-67/40; or a PG1-related biallelic marker selected from the group
consisting of 99-622, 4-77, 4-71, 4-73, 99-598, 99-576, and 4-66. Optionally, said method further
comprises determining the identity of a second allele at said biallelic marker, wherein said first allele
and second allele are not base paired (by Watson & Crick base pairing) to one another. Optionally,
said biological sample is derived from a single individual or subject. Optionally, said method is
performed in vitro. Optionally, said biallelic marker is determined for both copies of said biallelic
marker present in said individual's genome. Optionally, said biological sample is derived from
multiple subjects or individuals. Optionally, said method further comprises amplifying a portion of
said sequence comprising the biallelic marker prior to said determining step. Optionally, said
amplifying is performed by PCR, LCR, or replication of a recombinant vector comprising an origin of
replication and said portion in a host cell. Optionally, said determining is performed by a
hybridization assay, sequencing assay, microsequencing assay, or allele-specific amplification assay.

An additional embodiment of the invention comprises methods of estimating the frequency of
an allele in a population comprising determining the proportional representation of an allele at a
PG1-related biallelic marker in said population. In addition, the methods of estimating the frequency of an
allele in a population of the invention encompass methods with any further limitation described in this
disclosure, or those following: Optionally, said PG1-related biallelic marker is a PG1-related biallelic
marker positioned in SEQ ID NO: 179; a PG1-related biallelic marker selected from the group
consisting of 99-1485/251, 99-622/95, 99-619/141, 4-76/222, 4-77/151, 4-71/233, 4-72/127, 4-73/134,
99-610/250, 99-609/225, 4-90/283, 99-602/258, 99-600/492, 99-598/130, 99-217/277, 99-576/421,
4-61/269, 4-66/145, and 4-67/40; or a PG1-related biallelic marker selected from the group consisting of
99-622, 4-77, 4-71, 4-73, 99-598, 99-576, and 4-66. Optionally, determining the proportional
representation of an allele at a PG1-related biallelic marker is accomplished by determining the
identity of the alleles for both copies of said biallelic marker present in the genome of each individual
in said population and calculating the proportional representation of said allele at said PG1-related
biallelic marker for the population. Optionally, determining the proportional representation is
accomplished by performing a genotyping method of the invention on a pooled biological sample
derived from a representative number of individuals, or each individual, in said population, and
calculating the proportional amount of said nucleotide compared with the total.
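
By way of illustration only, the first option above (determining both copies of the marker in each individual and calculating the proportional representation of each allele) can be sketched in Python as follows. The function name `allele_frequencies`, the encoding of a genotype as a pair of allele symbols, and the sample data are assumptions made for this example and are not taken from the specification.

```python
from collections import Counter


def allele_frequencies(genotypes):
    """Estimate allele frequencies at one biallelic marker.

    genotypes: iterable of 2-tuples, one per individual, giving the allele
    found on each of the two copies of the marker, e.g. ("A", "G").
    Returns a dict mapping each allele to its proportion of all copies.
    """
    counts = Counter()
    for first_copy, second_copy in genotypes:
        counts[first_copy] += 1
        counts[second_copy] += 1
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no genotypes supplied")
    return {allele: n / total for allele, n in counts.items()}


# Hypothetical population: 5 A/A, 4 A/G and 1 G/G individuals.
sample = [("A", "A")] * 5 + [("A", "G")] * 4 + [("G", "G")] * 1
frequencies = allele_frequencies(sample)
```

Each individual contributes two allele copies, so the denominator is twice the number of individuals genotyped.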

A further embodiment of the invention comprises methods of detecting an association
between a genotype and a phenotype, comprising the steps of a) genotyping at least one PG1-related
biallelic marker in a trait positive population according to a genotyping method of the invention; b)
genotyping said PG1-related biallelic marker in a control population according to a genotyping
method of the invention; and c) determining whether a statistically significant association exists
between said genotype and said phenotype. In addition, the methods of detecting an association
between a genotype and a phenotype of the invention encompass methods with any further limitation
described in this disclosure, or those following: Optionally, said PG1-related biallelic marker is a
PG1-related biallelic marker positioned in SEQ ID NO: 179; a PG1-related biallelic marker selected
from the group consisting of 99-1485/251, 99-622/95, 99-619/141, 4-76/222, 4-77/151, 4-71/233,
4-72/127, 4-73/134, 99-610/250, 99-609/225, 4-90/283, 99-602/258, 99-600/492, 99-598/130,
99-217/277, 99-576/421, 4-61/269, 4-66/145, and 4-67/40; or a PG1-related biallelic marker selected
from the group consisting of 99-622, 4-77, 4-71, 4-73, 99-598, 99-576, and 4-66. Optionally, said
control population is a trait negative population, or a random population. Optionally, each of said
genotyping steps a) and b) is performed on a single pooled biological sample derived from each of
said populations. Optionally, each of said genotyping steps a) and b) is performed separately on
biological samples derived from each individual in said population or a subsample thereof.
Optionally, said phenotype is a disease, cancer or prostate cancer; a response to an anti-cancer agent
or an anti-prostate cancer agent; or a side effect to an anti-cancer or anti-prostate cancer agent.
Optionally, said method comprises the additional steps of determining the phenotype in said trait
positive and said control populations prior to step c).
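
As a non-limiting illustration of step c), one common way to test for association is a Pearson chi-square test on the 2x2 table of allele counts in the trait positive and control populations. The following Python sketch assumes that choice of test; the function name `allelic_association`, the omission of a continuity correction, and the counts used are assumptions made for this example only.

```python
import math


def allelic_association(case_counts, control_counts):
    """Pearson chi-square test (1 degree of freedom) on allele counts.

    case_counts, control_counts: (copies of allele 1, copies of allele 2)
    observed in the trait positive and control populations.
    Returns (chi_square, p_value, odds_ratio).
    """
    a, b = case_counts
    c, d = control_counts
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("every row and column total must be non-zero")
    chi_square = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # With 1 degree of freedom the upper-tail probability of the
    # chi-square distribution equals erfc(sqrt(chi_square / 2)).
    p_value = math.erfc(math.sqrt(chi_square / 2))
    odds_ratio = (a * d) / (b * c) if b * c else math.inf
    return chi_square, p_value, odds_ratio


# Hypothetical counts: 200 chromosomes from cases, 200 from controls.
chi_square, p_value, odds_ratio = allelic_association((120, 80), (90, 110))
```

The returned p-value is then compared against whatever significance threshold the study has fixed in advance.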

An additional embodiment of the present invention encompasses methods of estimating the
frequency of a haplotype for a set of biallelic markers in a population, comprising the steps of: a)
genotyping at least one PG1-related biallelic marker for both copies of said set of biallelic markers
present in the genome of each individual in said population or a subsample thereof, according to a
genotyping method of the invention; b) genotyping a second biallelic marker by determining the
identity of the allele at said second biallelic marker for both copies of said second biallelic marker
present in the genome of each individual in said population or said subsample, according to a
genotyping method of the invention; and c) applying a haplotype determination method to the
identities of the nucleotides determined in steps a) and b) to obtain an estimate of said frequency. In
addition, the methods of estimating the frequency of a haplotype of the invention encompass methods
with any further limitation described in this disclosure, or those following: Optionally, said
PG1-related biallelic marker is a PG1-related biallelic marker positioned in SEQ ID NO: 179; a
PG1-related biallelic marker selected from the group consisting of 99-1485/251, 99-622/95, 99-619/141,
4-76/222, 4-77/151, 4-71/233, 4-72/127, 4-73/134, 99-610/250, 99-609/225, 4-90/283, 99-602/258,
99-600/492, 99-598/130, 99-217/277, 99-576/421, 4-61/269, 4-66/145, and 4-67/40; or a PG1-related
biallelic marker selected from the group consisting of 99-622, 4-77, 4-71, 4-73, 99-598, 99-576,
and 4-66. Optionally, said second biallelic marker is a PG1-related biallelic marker; a PG1-related
biallelic marker positioned in SEQ ID NO: 179; a PG1-related biallelic marker selected from the
group consisting of 99-1485/251, 99-622/95, 99-619/141, 4-76/222, 4-77/151, 4-71/233, 4-72/127,
4-73/134, 99-610/250, 99-609/225, 4-90/283, 99-602/258, 99-600/492, 99-598/130, 99-217/277,
99-576/421, 4-61/269, 4-66/145, and 4-67/40; or a PG1-related biallelic marker selected from the group
consisting of 99-622, 4-77, 4-71, 4-73, 99-598, 99-576, and 4-66. Optionally, said PG1-related
biallelic marker and said second biallelic marker are 4-77/151 and 4-66/145. Optionally, said
haplotype determination method is an expectation-maximization algorithm.
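
For illustration only, an expectation-maximization estimate of haplotype frequencies for two biallelic markers can be sketched in Python as below. This is a minimal textbook-style variant under stated assumptions, not the procedure actually used by the applicants: each genotype is encoded as the number of copies (0, 1 or 2) of one designated allele at each marker, the starting frequencies are uniform, and the function name `em_two_marker_haplotypes` and the sample genotypes are hypothetical.

```python
def em_two_marker_haplotypes(genotypes, max_iter=200, tol=1e-12):
    """Estimate the four haplotype frequencies of two biallelic markers.

    genotypes: list of (g1, g2), where g1 and g2 count the copies (0, 1, 2)
    of the designated allele at marker 1 and marker 2 in one individual.
    Returns a dict keyed by haplotype (allele at marker 1, allele at marker 2),
    with 1 denoting the designated allele and 0 the other allele.
    """
    haplotypes = [(0, 0), (0, 1), (1, 0), (1, 1)]
    if not genotypes:
        raise ValueError("no genotypes supplied")
    n_chromosomes = 2 * len(genotypes)
    freq = {h: 0.25 for h in haplotypes}  # uniform starting point
    for _ in range(max_iter):
        counts = {h: 0.0 for h in haplotypes}
        for g1, g2 in genotypes:
            if g1 == 1 and g2 == 1:
                # E-step: a double heterozygote is either 00/11 or 01/10;
                # weight the two phases by the current frequency estimates.
                cis = freq[(0, 0)] * freq[(1, 1)]
                trans = freq[(0, 1)] * freq[(1, 0)]
                w = cis / (cis + trans) if cis + trans > 0 else 0.5
                counts[(0, 0)] += w
                counts[(1, 1)] += w
                counts[(0, 1)] += 1.0 - w
                counts[(1, 0)] += 1.0 - w
            else:
                # Phase is unambiguous: read the two haplotypes directly.
                counts[(g1 // 2, g2 // 2)] += 1.0
                counts[((g1 + 1) // 2, (g2 + 1) // 2)] += 1.0
        # M-step: new frequencies are the expected haplotype proportions.
        new_freq = {h: c / n_chromosomes for h, c in counts.items()}
        change = max(abs(new_freq[h] - freq[h]) for h in haplotypes)
        freq = new_freq
        if change < tol:
            break
    return freq


# Hypothetical sample: two 0/0, two 2/2 and two doubly heterozygous individuals.
genotypes = [(0, 0)] * 2 + [(2, 2)] * 2 + [(1, 1)] * 2
estimated = em_two_marker_haplotypes(genotypes)
```

Only doubly heterozygous individuals carry phase ambiguity with two markers, which is why they are the only genotypes that need the weighting step.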

An additional embodiment of the present invention encompasses methods of detecting an
association between a haplotype and a phenotype, comprising the steps of: a) estimating the
frequency of at least one haplotype in a trait positive population, according to a method of the
invention for estimating the frequency of a haplotype; b) estimating the frequency of said haplotype in
a control population, according to a method of the invention for estimating the frequency of a
haplotype; and c) determining whether a statistically significant association exists between said
haplotype and said phenotype. In addition, the methods of detecting an association between a
haplotype and a phenotype of the invention encompass methods with any further limitation described
in this disclosure, or those following: Optionally, said PG1-related biallelic marker is a PG1-related biallelic
marker positioned in SEQ ID NO: 179; a PG1-related biallelic marker selected from the group
consisting of 99-1485/251, 99-622/95, 99-619/141, 4-76/222, 4-77/151, 4-71/233, 4-72/127, 4-73/134,
99-610/250, 99-609/225, 4-90/283, 99-602/258, 99-600/492, 99-598/130, 99-217/277, 99-576/421,
4-61/269, 4-66/145, and 4-67/40; or a PG1-related biallelic marker selected from the group consisting of
99-622, 4-77, 4-71, 4-73, 99-598, 99-576, and 4-66. Optionally, said PG1-related biallelic marker
and said second biallelic marker are 4-77/151 and 4-66/145. Optionally, said haplotype exhibits a p- 
value of < Ix 10'^ in an association with a trait positive population with cancer, preferably prostate 

cancer. Optionally, said control population is a trait negative population, or a random population.
Optionally, said phenotype is a disease, cancer or prostate cancer; a response to an anti-cancer agent
or an anti-prostate cancer agent; or a side effect to an anti-cancer or anti-prostate cancer agent.
Optionally, said method comprises the additional steps of determining the phenotype in said trait 
positive and said control populations prior to step c). 

Additional embodiments and aspects of the present invention are set forth in the Detailed
Description of the Invention and the Examples.

Brief Description of the Drawings 

Figure 1 is a diagram showing the BAC contig containing the PG1 gene and the positions of
biallelic markers along the contig.

Figure 2 is a graph showing the results of the first screening of a prostate cancer association
study and the significance of various biallelic markers as measured by their chi squared and p-values
for a low density set of markers.

Figure 3 is a graph showing the results of the first screening of a prostate cancer association
study and the significance of various biallelic markers as measured by their chi squared and p-values
for a higher density set of markers.

Figure 4 is a table demonstrating the results of a haplotype analysis. Among all the
theoretical potential different haplotypes based on 2 to 9 markers, 11 haplotypes showing a strong
association with prostate cancer were selected, and their haplotype analysis results are shown here.

Figure 5 is a bar graph demonstrating the results of an experiment evaluating the significance
(p-values) of the haplotype analysis shown in Figure 4.

Figure 6A is a table listing the biallelic markers used in the haplotype analysis of Figure 4.
Figure 6B is a table listing additional biallelic markers in linkage disequilibrium with the PG1 gene.

Figure 7 is a table listing the positions of exons, splice sites, a stop codon, and a poly A site in
the PG1 gene.

Figure 8A is a diagram showing the genomic structure of PG1 in comparison with its most
abundant mRNA transcript. Figure 8B is a more detailed diagram showing the genomic structure of
PG1, including exons and introns.

Figure 9 is a table listing some of the homologies between the PG1 protein and known
proteins.

Figure 10 is a half-tone reproduction of a fluorescence micrograph of the perinuclear/nuclear
expression of PG1 in tumoral (PC3) and normal prostatic cell lines (PNT2). Vector "PG1" includes
all the coding exons from exon 1 to 8. For PC3 (upper panel) and PNT2 (lower panel), the nucleus
was labelled with Propidium iodide (IP, left panel). Note that EGFP fluorescence was detected in and
around the nucleus (GFP, middle panel), as shown when the two pictures were overlapped (right
panel).

Figure 11 is a half-tone reproduction of a fluorescence micrograph of the perinuclear/nuclear
expression of PG1/1-4 in tumoral (PC3) and normal prostatic cell lines (PNT2). Vector "PG1/1-4"
corresponds to an alternative messenger which is due to an alternative splicing, joining exon 1 to
exon 4, and resulting in the absence of exons 2 and 3. For PC3 (upper panel) and PNT2 (lower panel),
the nucleus was labelled with Propidium iodide (IP, left panel). Note that EGFP fluorescence was
detected in and around the nucleus (GFP, middle panel), as shown when the two pictures were
overlapped (right panel).

Figure 12 is a half-tone reproduction of a fluorescence micrograph of the perinuclear/nuclear
expression of PG1/1-5 in tumoral prostatic cell line (PC3) and cytoplasmic expression of PG1/1-5 in
normal prostatic cell line (PNT2). Vector "PG1/1-5" corresponds to an alternative messenger which is
due to an alternative splicing, joining exon 1 to exon 5, and resulting in the absence of exons 2, 3 and
4. For PC3 (upper panel) and PNT2 (lower panels), the nucleus was labelled with Propidium iodide
(IP). Note that in PC3 cells, EGFP fluorescence was detected in and around the nucleus (GFP, upper
middle panel), as shown when the two pictures were overlapped (upper right panel). In PNT2A cells,
EGFP fluorescence was detected in the cytoplasm (GFP, lower left panel), as shown when the two
pictures were overlapped (lower right panel).

Figure 13 is a half-tone reproduction of a fluorescence micrograph of the perinuclear/nuclear
expression of a mutated form of PG1 (PG1mut229) in normal prostatic cell line (PNT2). Vector
"PG1/1-7" includes exons 1 to 6, and corresponds to the mutated form identified in genomic DNA of
the prostatic tumoural cell line LNCaP. The nucleus was labelled with Propidium iodide (IP, left
panel). EGFP fluorescence was detected in the cytoplasm (GFP, middle panel), as shown when the
two pictures were overlapped (lower right panel).

Figure 14 is a diagram of the structure of the 14 alternative splice species found for human
PG1 by the exons present. An * indicates that there is a stop codon in frame at that location. An
arrow to the right at the right-hand side of a splice species indicates that the open reading frame
continues off of the chart. A space between exons indicates that the exon(s) is missing from that
particular alternative splice species. An up arrow indicates that either exon 1bis, 3bis, or 5bis has
been inserted depending upon which is indicated. A bracket notation in exon 6, over an exon 6bis
notation, indicates that the first 60 bases are missing from exon 6, and exon 6bis is therefore present as a
truncated form of exon 6.

Figure 15 is a table listing the results of a series of RT-PCR experiments that were performed
on RNA of normal prostate, normal prostatic cell lines (PNT1A, PNT1B and PNT2), tumoral
prostatic cell lines (LnCaPFCG, LnCaPJMB, CaHPV, Du145, PC3), and prostate tumors (ECP5 to
ECP24) using all the possible combinations of primers (SEQ ID NOs: 137-178) specific to all of the
possible splice junctions or exon borders in human PG1. An NT indicates that the experiment was
not performed. A [+] indicates the use of an alternative splice species with exons 1, 3, 4, 7, and 8.

Figure 16 is a graph showing the results of association studies using markers spanning the 650
kb region of the 8p23 locus around PG1, using both single point analysis and haplotyping studies.

Figure 17 is a graph showing an enlarged view of the single point association results within a
160 kb region comprising the PG1 gene.

Figure 18A is a graph showing an enlarged view of the single point association results of 40
kb within the PG1 gene. Figure 18B is a table listing the location of markers within the PG1 gene and the two
possible alleles at each site. For each marker, the disease-associated allele is indicated first; its
frequencies in cases and controls as well as the difference between both are shown; the odds ratio and
the p-value of each individual marker association are also shown.

Figure 19A is a table showing the results of a haplotype analysis study using 4 markers
(marker Nos. 4-14, 99-217, 4-66 and 99-221) within the 160 kb region shown in Figure 17. Figure
19B is a table showing the segmented haplotyping results according to the subject's age, and whether
the prostate cancer cases were sporadic or familial, using the same 4 markers and the same
individuals as were used to generate the results in Figure 19A.

Figure 20 is a table listing the haplotyping results and odds ratios for combinations of the 7
markers (99-622; 4-77; 4-71; 4-73; 99-598; 99-576; 4-66) within the PG1 gene that were shown in
Figure 18 to have p-values more significant than 1.10'^. All of the 2-, 3-, 4-, 5-, 6- and 7-marker 
haplotypes were tested. 

Figure 21 is a graph showing the distribution of statistical significance, as measured by
Chi-square values, for each series of possible x-marker haplotypes (x = 2, 3 or 4) using all of the 19
markers listed in Figure 18B.

Detailed Description of the Preferred Embodiment

The practice of the present invention encompasses conventional techniques of chemistry,
immunology, molecular biology, biochemistry, protein chemistry, and recombinant DNA technology,
which are within the skill of the art. Such techniques are explained fully in the literature. See, e.g.,
Oligonucleotide Synthesis (M. Gait ed. 1984); Nucleic Acid Hybridization (B. Hames & S. Higgins,
eds., 1984); Sambrook, Fritsch & Maniatis, Molecular Cloning: A Laboratory Manual, Second
Edition (1989); PCR Technology (H.A. Erlich ed., Stockton Press); R. Scopes, Protein Purification
Principles and Practice (Springer-Verlag); and the series Methods in Enzymology (S. Colowick and
N. Kaplan eds., Academic Press, Inc.).

Definitions

As used interchangeably herein, the terms "nucleic acid", "oligonucleotide", and
"polynucleotides" include RNA, DNA, or RNA/DNA hybrid sequences of more than one nucleotide
in either single chain or duplex form. The term "nucleotide" is used herein as an adjective to
describe molecules comprising RNA, DNA, or RNA/DNA hybrid sequences of any length in
single-stranded or duplex form. The term "nucleotide" is also used herein as a noun to refer to individual
nucleotides or varieties of nucleotides, meaning a molecule, or individual unit in a larger nucleic acid
molecule, comprising a purine or pyrimidine, a ribose or deoxyribose sugar moiety, and a phosphate
group, or phosphodiester linkage in the case of nucleotides within an oligonucleotide or
polynucleotide. The term "nucleotide" is also used herein to encompass "modified
nucleotides" which comprise at least one of the following modifications: (a) an alternative linking group, (b) an
analogous form of purine, (c) an analogous form of pyrimidine, or (d) an analogous sugar; for
examples of analogous linking groups, purines, pyrimidines, and sugars see, for example, PCT
publication No. WO 95/04064. However, the polynucleotides of the invention are preferably
comprised of greater than 50% conventional deoxyribose nucleotides, and most preferably greater
than 90% conventional deoxyribose nucleotides. The polynucleotide sequences of the invention may be
prepared by any known method, including synthetic, recombinant, ex vivo generation, or a
combination thereof, as well as utilizing any purification methods known in the art.

As used herein, the term "purified" does not require absolute purity; rather, it is intended as a
relative definition. Purification of starting material or natural material to at least one order of magnitude,
preferably two or three orders, and more preferably four or five orders of magnitude is expressly
contemplated.

The term "purified" is used herein to describe a polynucleotide or polynucleotide vector of
the invention which has been separated from other compounds including, but not limited to, other
nucleic acids, carbohydrates, lipids and proteins (such as the enzymes used in the synthesis of the
polynucleotide), or the separation of covalently closed polynucleotides from linear polynucleotides.
A polynucleotide is substantially pure when at least about 50%, preferably 60 to 75% of a sample
exhibits a single polynucleotide sequence and conformation (linear versus covalently closed). A
substantially pure polynucleotide typically comprises about 50%, preferably 60 to 90%
weight/weight of a nucleic acid sample, more usually about 95%, and preferably is over about 99%
pure. Polynucleotide purity or homogeneity is indicated by a number of means well known in the art,
such as agarose or polyacrylamide gel electrophoresis of a sample, followed by visualizing a single
polynucleotide band upon staining the gel. For certain purposes higher resolution can be provided by
using HPLC or other means well known in the art.

The term "polypeptide" refers to a polymer of amino acids without regard to the length of the
polymer; thus, peptides, oligopeptides, and proteins are included within the definition of polypeptide.
This term also does not specify or exclude post-expression modifications of polypeptides; for
example, polypeptides which include the covalent attachment of glycosyl groups, acetyl groups,
phosphate groups, lipid groups and the like are expressly encompassed by the term polypeptide. Also
included within the definition are polypeptides which contain one or more analogs of an amino acid
(including, for example, non-naturally occurring amino acids, amino acids which only occur naturally
in an unrelated biological system, modified amino acids from mammalian systems etc.), polypeptides
with substituted linkages, as well as other modifications known in the art, both naturally occurring
and non-naturally occurring.

As used herein, the term "isolated" requires that the material be removed from its original
environment (e.g., the natural environment if it is naturally occurring).

The term "purified" is used herein to describe a polypeptide of the invention which has been
separated from other compounds including, but not limited to, nucleic acids, lipids, carbohydrates and
other proteins. A polypeptide is substantially pure when at least about 50%, preferably 60 to 75% of
a sample exhibits a single polypeptide sequence. A substantially pure polypeptide typically comprises
about 50%, preferably 60 to 90% weight/weight of a protein sample, more usually about 95%, and
preferably is over about 99% pure. Polypeptide purity or homogeneity is indicated by a number of
means well known in the art, such as agarose or polyacrylamide gel electrophoresis of a sample,
followed by visualizing a single polypeptide band upon staining the gel. For certain purposes higher
resolution can be provided by using HPLC or other means well known in the art.

As used herein, the term "non-human animal" refers to any non-human vertebrate, birds and
more usually mammals, preferably primates, farm animals such as swine, goats, sheep, donkeys, and
horses, rabbits or rodents, more preferably rats or mice. As used herein, the term "animal" is used to
refer to any vertebrate, preferably a mammal. Both the terms "animal" and "mammal" expressly
embrace human subjects unless preceded with the term "non-human".

As used herein, the term "antibody" refers to a polypeptide or group of polypeptides which
are comprised of at least one binding domain, where an antibody binding domain is formed from the
folding of variable domains of an antibody molecule to form three-dimensional binding spaces with
an internal surface shape and charge distribution complementary to the features of an antigenic
determinant of an antigen, which allows an immunological reaction with the antigen. Antibodies
include recombinant proteins comprising the binding domains, as well as fragments, including Fab,
Fab', F(ab)2, and F(ab')2 fragments.

As used herein, an "antigenic determinant" is the portion of an antigen molecule, in this case
a PG1 polypeptide, that determines the specificity of the antigen-antibody reaction. An "epitope"
refers to an antigenic determinant of a polypeptide. An epitope can comprise as few as 3 amino acids
in a spatial conformation which is unique to the epitope. Generally an epitope consists of at least 6
such amino acids, and more usually at least 8-10 such amino acids. Methods for determining the
amino acids which make up an epitope include x-ray crystallography, 2-dimensional nuclear magnetic
resonance, and epitope mapping, e.g., the Pepscan method described by H. Mario Geysen et al. 1984.
Proc. Natl. Acad. Sci. U.S.A. 81:3998-4002; PCT Publication No. WO 84/03564; and PCT
Publication No. WO 84/03506.

The terms "DNA construct" and "vector" are used herein to mean a purified or isolated
polynucleotide that has been artificially designed and which comprises at least two nucleotide
sequences that are not found as contiguous nucleotide sequences in their natural environment.

The terms "trait" and "phenotype" are used interchangeably herein and refer to any visible,
detectable or otherwise measurable property of an organism such as symptoms of, or susceptibility to
a disease for example. Typically the terms "trait" or "phenotype" are used herein to refer to
symptoms of, or susceptibility to cancer or prostate cancer; or to refer to an individual's response to
an anti-cancer agent or an anti-prostate cancer agent; or to refer to symptoms of, or susceptibility to
side effects to an anti-cancer agent or an anti-prostate cancer agent.

The term "allele" is used herein to refer to variants of a nucleotide sequence. A biallelic
polymorphism has two forms. Typically the first identified allele is designated as the original allele
whereas other alleles are designated as alternative alleles. Diploid organisms may be homozygous or
heterozygous for an allelic form.

The term "heterozygosity rate" is used herein to refer to the incidence of individuals in a
population, which are heterozygous at a particular allele. In a biallelic system the heterozygosity rate
is on average equal to 2Pa(1-Pa), where Pa is the frequency of the least common allele. In order to be
useful in genetic studies a genetic marker should have an adequate level of heterozygosity to allow a
reasonable probability that a randomly selected person will be heterozygous.
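
As a purely illustrative calculation of the expression 2Pa(1-Pa) given above, the following Python sketch tabulates the expected heterozygosity rate for a few allele frequencies. The function name `heterozygosity_rate` and the example frequencies are assumptions made for this illustration; the expression corresponds to the heterozygote proportion expected under Hardy-Weinberg proportions.

```python
def heterozygosity_rate(p_a):
    """Expected heterozygosity 2*Pa*(1 - Pa) at a biallelic marker.

    p_a: frequency of one allele (by convention here, the less common one),
    expressed as a fraction between 0 and 1.
    """
    if not 0.0 <= p_a <= 1.0:
        raise ValueError("allele frequency must lie between 0 and 1")
    return 2.0 * p_a * (1.0 - p_a)


# Hypothetical allele frequencies chosen only to show the trend.
rates = {p_a: heterozygosity_rate(p_a) for p_a in (0.1, 0.3, 0.5)}
```

The rate rises as the less common allele becomes more frequent and reaches its maximum of 0.5 when both alleles are equally frequent.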

The term "genotype" as used herein refers the identity of the alleles present in an individual or 
a sample. In the context of the present invention a genotype preferably refers to the description of the 
25 biallelic marker alleles present in an individual or a sample. The term "genotyping" a sample or an 
individual for a biallelic marker consists of determining the specific allele or the specific nucleotide 
carried by an individual at a biallelic marker. 

The term "mutation" as used herein refers to a difference in DNA sequence between or among
different genomes or individuals which has a frequency below 1%.

The term "haplotype" refers to a combination of alleles present in an individual or a sample.
In the context of the present invention a haplotype preferably refers to a combination of biallelic
marker alleles found in a given individual and which is associated with a phenotype.

The term "polymorphism" as used herein refers to the occurrence of two or more alternative 
genomic sequences or alleles between or among different genomes or individuals. "Polymorphic" 
refers to the condition in which two or more variants of a specific genomic sequence can be found in a
population. A "polymorphic site" is the locus at which the variation occurs. A single nucleotide
polymorphism is a single base pair change. Typically a single nucleotide polymorphism is the
replacement of one nucleotide by another nucleotide at the polymorphic site. Deletion of a single
nucleotide or insertion of a single nucleotide also gives rise to single nucleotide polymorphisms. In
the context of the present invention "single nucleotide polymorphism" preferably refers to a single
nucleotide substitution. Typically, between different genomes or between different individuals, the
polymorphic site is occupied by two different nucleotides. 

The terms "biallelic polymorphism" and "biallelic marker" are used interchangeably herein to 
refer to a nucleotide polymorphism having two alleles at a fairly high frequency in the population. A 
"biallelic marker allele" refers to the nucleotide variants present at a biallelic marker site. Usually a 
biallelic marker is a single nucleotide polymorphism. However, less commonly there are also
insertions and deletions of up to 5 nucleotides which constitute biallelic markers for the purposes of
the present invention. Typically the frequency of the less common allele of the biallelic markers of
the present invention has been validated to be greater than 1%, preferably the frequency is greater
than 10%, more preferably the frequency is at least 20% (i.e. heterozygosity rate of at least 0.32),
even more preferably the frequency is at least 30% (i.e. heterozygosity rate of at least 0.42). A
biallelic marker wherein the frequency of the less common allele is 30% or more is termed a "high 
quality biallelic marker." 

The location of nucleotides in a polynucleotide with respect to the center of the
polynucleotide is described herein in the following manner. When a polynucleotide has an odd
number of nucleotides, the nucleotide at an equal distance from the 3' and 5' ends of the
polynucleotide is considered to be "at the center" of the polynucleotide, and any nucleotide
immediately adjacent to the nucleotide at the center, or the nucleotide at the center itself is considered
to be "within 1 nucleotide of the center." With an odd number of nucleotides in a polynucleotide any
of the five nucleotide positions in the middle of the polynucleotide would be considered to be within
2 nucleotides of the center, and so on. When a polynucleotide has an even number of nucleotides,
there would be a bond and not a nucleotide at the center of the polynucleotide. Thus, either of the two 
central nucleotides would be considered to be "within 1 nucleotide of the center" and any of the four 
nucleotides in the middle of the polynucleotide would be considered to be "within 2 nucleotides of the 
center", and so on. 
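
The convention above can be sketched as a small Python helper, offered purely as an illustration. The function name and the use of 1-based positions counted from the 5' end are assumptions made for the example and do not appear in the text.

```python
from typing import List


def positions_within_center(length: int, n: int) -> List[int]:
    """Return the 1-based positions that are 'within n nucleotides of the center'.

    Odd length: the center is a nucleotide, so n = 0 returns that nucleotide
    and n = 1 returns it plus its two neighbours.
    Even length: the center is a bond, so n = 0 returns nothing and n = 1
    returns the two central nucleotides.
    """
    if length < 1 or n < 0:
        raise ValueError("length must be >= 1 and n must be >= 0")
    if length % 2 == 1:
        center = (length + 1) // 2
        low, high = center - n, center + n
    else:
        if n == 0:
            return []  # a bond, not a nucleotide, sits at the center
        low, high = length // 2 - (n - 1), length // 2 + n
    # Clip to the ends of the polynucleotide.
    return list(range(max(1, low), min(length, high) + 1))


if __name__ == "__main__":
    print(positions_within_center(47, 0))
    print(positions_within_center(47, 2))
    print(positions_within_center(48, 2))
```

For a 47-nucleotide polynucleotide the center is position 24, so a polymorphic base at position 24 of a 47-mer is "at the center" under this convention.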

The term "upstream" is used herein to refer to a location which is toward the 5' end of the
polynucleotide from a specific reference point.

The terms "base paired" and "Watson & Crick base paired" are used interchangeably herein
to refer to nucleotides which can be hydrogen bonded to one another by virtue of their sequence
identities in a manner like that found in double-helical DNA with thymine or uracil residues linked to
adenine residues by two hydrogen bonds and cytosine and guanine residues linked by three hydrogen
bonds (See Stryer, L., Biochemistry, 4th edition, 1995).

The terms "complementary" or "complement thereof" are used herein to refer to the
sequence of a polynucleotide which is capable of forming Watson & Crick base pairing with another
specified polynucleotide throughout the entirety of the complementary region. This term is applied to
pairs of polynucleotides based solely upon their sequences and not any particular set of conditions
under which the two polynucleotides would actually bind.
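
Because complementarity is defined here purely in terms of sequence, it can be checked mechanically. The following Python sketch is illustrative only: the function names are hypothetical, both sequences are assumed to be written 5' to 3', and uracil is treated as equivalent to thymine for pairing with adenine.

```python
# Watson & Crick pairing partners for DNA bases.
_PAIRS = {"A": "T", "T": "A", "C": "G", "G": "C"}


def _normalize(seq: str) -> str:
    """Upper-case the sequence and treat uracil as thymine."""
    return seq.upper().replace("U", "T")


def reverse_complement(seq: str) -> str:
    """Return the complement of seq, written 5' to 3'."""
    return "".join(_PAIRS[base] for base in reversed(_normalize(seq)))


def are_complementary(first: str, second: str) -> bool:
    """True if the two sequences can base pair over their entire length."""
    return reverse_complement(first) == _normalize(second)


if __name__ == "__main__":
    print(reverse_complement("ATGC"))
    print(are_complementary("ATGC", "GCAT"))
```

The check deliberately ignores hybridization conditions such as temperature or salt concentration, in keeping with the definition above.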

As used herein the term "PG1-related biallelic marker" relates to a set of biallelic markers in
linkage disequilibrium with PG1. The term PG1-related biallelic marker includes all of the biallelic
markers used in the initial association studies shown below in Section I.D., including those biallelic
markers contained in SEQ ID NOs: 21-38 and 57-62. The term PG1-related biallelic marker
encompasses all of the following polymorphisms positioned in SEQ ID NO: 179, and listed by internal
reference number, including: 5-63-169 G or C in position 2159;
5-63-453 C or T in position 2443; 99-622-95 T or C in position 4452;
99-621-215 T or C in position 5733; 99-619-141 G or A in position 8438;
4-76-222 deletion of GT in position 11843; 4-76-361 C or T in position 11983;
4-77-151 G or C in position 12080; 4-77-294 A or G in position 12221;
4-71-33 G or T in position 12947; 4-71-233 A or G in position 13147;
4-71-280 G or A in position 13194; 4-71-396 G or C in position 13310;
4-72-127 A or G in position 13342; 4-72-152 A or G in position 13367;
4-72-380 deletion of A in position 13594; 4-73-134 G or C in position 13680;
4-73-356 G or C in position 13902; 99-610-250 T or C in position 16231;
99-610-93 A or T in position 16388; 99-609-225 A or T in position 17608;
4-90-27 A or C in position 18034; 4-90-283 A or C in position 18290;
99-607-397 T or C in position 18786; 99-602-295 deletion of A in position 22835;
99-602-258 T or C in position 22872;
99-600-492 deletion of TATTG in position 25183;
99-600-483 T or G in position 25192; 5-23-288 A or G in position 25614;
99-598-130 T or C in position 26911; 99-592-139 A or T in position 32703;
99-217-277 C or T in position 34491; 5-47-284 A or G in position 34756;
99-589-267 T or G in position 34934; 99-589-41 G or C in position 35160;
99-12899-307 C or T in position 39897; 4-12-68 A or G in position 40598;
99-582-263 T or C in position 40816; 99-582-132 T or C in position 40947;
99-576-421 G or C in position 45783; 4-13-51 C or T in position 47929;
4-13-328 A or T in position 48206; 4-13-329 G or C in position 48207;
99-12903-381 C or T in position 49282; 5-56-208 A or G in position 50037;
5-56-225 A or G in position 50054; 5-56-272 A or G in position 50101;
5-56-391 G or T in position 50220; 4-61-269 A or G in position 50440;
4-61-391 A or G in position 50562; 4-63-99 A or G in position 50653;
4-62-120 A or G in position 50660; 4-62-205 A or G in position 50745;
4-64-113 A or T in position 50885; 4-65-104 A or G in position 51249;
5-28-300 A or G in position 51333; 5-50-269 C or T in position 51435;
4-65-324 C or T in position 51468; 5-71-129 G or C in position 51515;
5-50-391 G or C in position 51557; 5-71-180 A or G in position 51566;
4-67-40 C or T in position 51632; 5-71-280 A or C in position 51666;
5-58-167 A or G in position 52016; 5-30-325 C or T in position 52096;
5-58-302 A or T in position 52151; 5-31-178 A or G in position 52282;
5-31-244 A or G in position 52348; 5-31-306 deletion of A in position 52410;
5-32-190 C or T in position 52524; 5-32-246 C or T in position 52580;
5-32-378 deletion of A in position 52712; 5-53-266 G or C in position 52772;
5-60-158 C or T in position 52860; 5-60-390 A or G in position 53092;
5-68-272 G or C in position 53272; 5-68-385 A or T in position 53389;
5-66-53 deletion of GA in position 53511; 5-66-142 G or C in position 53600;
5-66-207 A or G in position 53665; 5-37-294 A or G in position 53815;
5-62-163 insertion of A in position 54365; 5-62-340 A or T in position 54541; and the complements
thereof. The term PG1-related biallelic marker also includes all of the following biallelic markers

listed by internal reference number, and two SEQ ID NOs each of which contains a 47-mer with one
of the two alternative bases at position 24:
4-14-107 of SEQ ID NOs 185 and 262; 4-14-317 of SEQ ID NOs 186 and 263;
4-14-35 of SEQ ID NOs 187 and 264; 4-20-149 of SEQ ID NOs 188 and 265;
4-20-77 of SEQ ID NOs 189 and 266; 4-22-174 of SEQ ID NOs 190 and 267;
4-22-176 of SEQ ID NOs 191 and 268; 4-26-60 of SEQ ID NOs 192 and 269;
4-26-72 of SEQ ID NOs 193 and 270; 4-3-130 of SEQ ID NOs 194 and 271;
4-38-63 of SEQ ID NOs 195 and 272;
4-38-83 of SEQ ID NOs 196 and 273; 4-4-152 of SEQ ID NOs 197 and 274;
4-4-187 of SEQ ID NOs 198 and 275; 4-4-288 of SEQ ID NOs 199 and 276;
4-42-304 of SEQ ID NOs 200 and 277; 4-42-401 of SEQ ID NOs 201 and 278;
4-43-328 of SEQ ID NOs 202 and 279; 4-43-70 of SEQ ID NOs 203 and 280;
4-50-209 of SEQ ID NOs 204 and 281; 4-50-293 of SEQ ID NOs 205 and 282;
4-50-323 of SEQ ID NOs 206 and 283; 4-50-329 of SEQ ID NOs 207 and 284;
4-50-330 of SEQ ID NOs 208 and 285; 4-52-163 of SEQ ID NOs 209 and 286;
4-52-88 of SEQ ID NOs 210 and 287; 4-53-258 of SEQ ID NOs 211 and 288;
4-54-283 of SEQ ID NOs 212 and 289; 4-54-388 of SEQ ID NOs 213 and 290;
4-55-70 of SEQ ID NOs 214 and 291; 4-55-95 of SEQ ID NOs 215 and 292;
4-56-159 of SEQ ID NOs 216 and 293; 4-56-213 of SEQ ID NOs 217 and 294;
4-58-289 of SEQ ID NOs 218 and 295; 4-58-318 of SEQ ID NOs 219 and 296;
4-60-266 of SEQ ID NOs 220 and 297; 4-60-293 of SEQ ID NOs 221 and 298;
4-84-241 of SEQ ID NOs 222 and 299; 4-84-262 of SEQ ID NOs 223 and 300;
4-86-206 of SEQ ID NOs 224 and 301; 4-86-309 of SEQ ID NOs 225 and 302;
4-88-349 of SEQ ID NOs 226 and 303; 4-89-87 of SEQ ID NOs 227 and 304;
99-123-184 of SEQ ID NOs 228 and 305; 99-128-202 of SEQ ID NOs 229 and 306;
99-128-275 of SEQ ID NOs 230 and 307; 99-128-313 of SEQ ID NOs 231 and 308;
99-128-60 of SEQ ID NOs 232 and 309; 99-12907-295 of SEQ ID NOs 233 and 310;
99-130-58 of SEQ ID NOs 234 and 311; 99-134-362 of SEQ ID NOs 235 and 312;
99-140-130 of SEQ ID NOs 236 and 313; 99-1462-238 of SEQ ID NOs 237 and 314;
99-147-181 of SEQ ID NOs 238 and 315; 99-1474-156 of SEQ ID NOs 239 and 316;
99-1474-359 of SEQ ID NOs 240 and 317; 99-1479-158 of SEQ ID NOs 241 and 318;
99-1479-379 of SEQ ID NOs 242 and 319; 99-148-129 of SEQ ID NOs 243 and 320;
99-148-132 of SEQ ID NOs 244 and 321; 99-148-139 of SEQ ID NOs 245 and 322;
99-148-140 of SEQ ID NOs 246 and 323; 99-148-182 of SEQ ID NOs 247 and 324;
99-148-366 of SEQ ID NOs 248 and 325; 99-148-76 of SEQ ID NOs 249 and 326;
99-1480-290 of SEQ ID NOs 250 and 327; 99-1481-285 of SEQ ID NOs 251 and 328;
99-1484-101 of SEQ ID NOs 252 and 329; 99-1484-328 of SEQ ID NOs 253 and 330;
99-1485-251 of SEQ ID NOs 254 and 331; 99-1490-381 of SEQ ID NOs 255 and 332;
99-1493-280 of SEQ ID NOs 256 and 333; 99-151-94 of SEQ ID NOs 257 and 334;
99-211-291 of SEQ ID NOs 258 and 335; 99-213-37 of SEQ ID NOs 259 and 336;
99-221-442 of SEQ ID NOs 260 and 337; 99-222-109 of SEQ ID NOs 261 and 338; and the
complements thereof.

The term "non-genic" is used herein to describe PG1-related biallelic markers, as well as
polynucleotides and primers which do not occur in the human PG1 genomic sequence of SEQ ID NO:
179. The term "genic" is used herein to describe PG1-related biallelic markers as well as
polynucleotides and primers which do occur in the human PG1 genomic sequence of SEQ ID NO:
179.

The term "an anti-cancer agent" refers to a drug or a compound that is capable of reducing
the growth rate, rate of metastasis, or viability of tumor cells in a mammal, is capable of reducing the
size or eliminating tumors in a mammal, or is capable of increasing the average life span of a mammal
or human with cancer. Anti-cancer agents also include compounds which are able to reduce the risk
of cancer developing in a population, particularly a high risk population. The term "an anti-prostate
cancer agent" is an anti-cancer agent that has these effects on cells or tumors that are derived from
prostate cancer cells.

The terms "response to an anti-cancer agent" and "response to an anti-prostate cancer agent"
refer to drug efficacy, including but not limited to the ability to metabolize a compound, the ability to
convert a pro-drug to an active drug, and the pharmacokinetics (absorption, distribution,
elimination) and the pharmacodynamics (receptor-related) of a drug in an individual.

The terms "side effects to an anti-cancer agent" and "side effects to an anti-prostate cancer
agent" refer to adverse effects of therapy resulting from extensions of the principal pharmacological
action of the drug or to idiosyncratic adverse reactions resulting from an interaction of the drug with
unique host factors. These side effects include, but are not limited to, adverse reactions such as
dermatological, hematological or hepatological toxicities and further include gastric and intestinal
ulceration, disturbance in platelet function, renal injury, nephritis, vasomotor rhinitis with profuse
watery secretions, angioneurotic edema, generalized urticaria, and bronchial asthma to laryngeal
edema and bronchoconstriction, hypotension, sexual dysfunction, and shock.

As used herein the term "homology" refers to comparisons between protein and/or nucleic
acid sequences and is evaluated using any of the variety of sequence comparison algorithms and
programs known in the art. Such algorithms and programs include, but are by no means limited to,
TBLASTN, BLASTP, FASTA, TFASTA, and CLUSTALW (Pearson and Lipman, 1988, Proc. Natl.
Acad. Sci. USA 85(8):2444-2448; Altschul et al., 1990, J. Mol. Biol. 215(3):403-410; Thompson et
al., 1994, Nucleic Acids Res. 22(22):4673-4680; Higgins et al., 1996, Methods Enzymol. 266:383-402;
Altschul et al., 1990, J. Mol. Biol. 215(3):403-410; Altschul et al., 1993, Nature Genetics 3:266-272).
In a particularly preferred embodiment, protein and nucleic acid sequence homologies are evaluated
using the Basic Local Alignment Search Tool ("BLAST") which is well known in the art (see, e.g.,
Karlin and Altschul, 1990, Proc. Natl. Acad. Sci. USA 87:2267-2268; Altschul et al., 1990, J. Mol.
Biol. 215:403-410; Altschul et al., 1993, Nature Genetics 3:266-272; Altschul et al., 1997, Nuc. Acids
Res. 25:3389-3402). In particular, five specific BLAST programs are used to perform the following
tasks:

(1) BLASTP and BLAST3 compare an amino acid query sequence against a
protein sequence database;

(2) BLASTN compares a nucleotide query sequence against a nucleotide
sequence database;

(3) BLASTX compares the six-frame conceptual translation products of a
query nucleotide sequence (both strands) against a protein sequence
database;

(4) TBLASTN compares a query protein sequence against a nucleotide
sequence database translated in all six reading frames (both strands); and

(5) TBLASTX compares the six-frame translations of a nucleotide query
sequence against the six-frame translations of a nucleotide sequence
database.

The BLAST programs identify homologous sequences by identifying similar segments, which are
referred to herein as "high-scoring segment pairs," between a query amino or nucleic acid sequence
and a test sequence which is preferably obtained from a protein or nucleic acid sequence database.
High-scoring segment pairs are preferably identified (i.e., aligned) by means of a scoring matrix,
many of which are known in the art. Preferably, the scoring matrix used is the BLOSUM62 matrix
(Gonnet et al., 1992, Science 256:1443-1445; Henikoff and Henikoff, 1993, Proteins 17:49-61). Less
preferably, the PAM or PAM250 matrices may also be used (see, e.g., Schwartz and Dayhoff, eds.,
1978, Matrices for Detecting Distance Relationships: Atlas of Protein Sequence and Structure,
Washington: National Biomedical Research Foundation). The BLAST programs evaluate the
statistical significance of all high-scoring segment pairs identified, and preferably select those
segments which satisfy a user-specified threshold of significance, such as a user-specified percent
homology. Preferably, the statistical significance of a high-scoring segment pair is evaluated using
the statistical significance formula of Karlin (see, e.g., Karlin and Altschul, 1990, Proc. Natl. Acad.
Sci. USA 87:2267-2268).
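
By way of illustration, the Karlin and Altschul significance estimate is commonly written as E = K·m·n·exp(-lambda·S), where S is the raw score of a high-scoring segment pair, m and n are the query and database lengths, and K and lambda are statistical parameters that depend on the scoring matrix. The Python sketch below only evaluates that expression; the function names are hypothetical and the numerical values of LAMBDA and K are placeholders chosen for the example, not values taken from this document or from any particular BLAST release.

```python
import math


def expected_hits(raw_score: float, query_len: int, db_len: int,
                  lam: float, k: float) -> float:
    """Expected number of chance segment pairs scoring at least raw_score."""
    return k * query_len * db_len * math.exp(-lam * raw_score)


def bit_score(raw_score: float, lam: float, k: float) -> float:
    """Normalized score, so that E = m * n * 2 ** (-bit_score)."""
    return (lam * raw_score - math.log(k)) / math.log(2.0)


if __name__ == "__main__":
    LAMBDA, K = 0.318, 0.134  # placeholder parameters, assumed for illustration
    for score in (40, 60, 80):
        e = expected_hits(score, 250, 50_000_000, LAMBDA, K)
        print(f"S = {score}  bits = {bit_score(score, LAMBDA, K):.1f}  E = {e:.3g}")
```

A user-specified significance threshold then amounts to keeping only those segment pairs whose E falls below a chosen cutoff.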

I. ISOLATION AND CHARACTERIZATION OF THE PG1 GENE AND PROTEINS
I.A. The 8p23 Region - LOH Studies: Implications of 8p23 Region in Distinct Cancer Types

Substantial amounts of LOH data support the hypothesis that genes associated with distinct
cancer types are located within the 8p23 region of the human genome. Emi et al. demonstrated the
implication of the 8p23.1-8p21.3 region in cases of hepatocellular carcinoma, colorectal cancer, and
non-small cell lung cancer. (Emi M, Fujiwara Y, Nakajima T, Tsuchiya E, Tsuda H, Hirohashi S, Maeda
Y, Tsuruta K, Miyaki M, Nakamura Y, Cancer Res. 1992 Oct 1; 52(19): 5368-5372). Yaremko et al.
showed the existence of two major regions of LOH for chromosome 8 markers in a sample of 87
colorectal carcinomas. The most prominent loss was found for 8p23.1-pter, where 45% of
informative cases demonstrated loss of alleles. (Yaremko ML, Wasylyshyn ML, Paulus KL,
Michelassi F, Westbrook CA, Genes Chromosomes Cancer 1994 May; 10(1): 1-6). Scholnick et al.
demonstrated the existence of three distinct regions of LOH for the markers of chromosome 8 in cases
of squamous cell carcinoma of the supraglottic larynx. They showed that the allelic loss of 8p23
marker D8S264 serves as a statistically significant, independent predictor of poor prognosis for
patients with supraglottic squamous cell carcinoma. (Scholnick SB, Haughey BH, Sunwoo JB,
el-Mofty SK, Baty JD, Piccirillo JF, Zequeira MR, J. Natl. Cancer Inst. 1996 Nov 20; 88(22): 1676-1682
and Sunwoo JB, Holt MS, Radford DM, Deeker C, Scholnick S, Genes Chromosomes Cancer 1996
Jul; 16(3): 164-169).

In other studies, Nagai et al. demonstrated the highest loss of heterozygosity in the specific
region of 8p23 by genome wide scanning of LOH in 120 cases of hepatocellular carcinoma (HCC).
(Nagai H, Pineau P, Tiollais P, Buendia MA, Dejean A, Oncogene 1997 Jun 19; 14(24): 2927-2933).
Gronwald et al. demonstrated 8p23-pter loss in renal clear cell carcinomas. (Gronwald J, Storkel S,
Holtgreve-Grez H, Hadaczek P, Brinkschmidt C, Jauch A, Lubinski J, Cremer, Cancer Res. 1997 Feb
1; 57(3): 481-487).

The same region is involved in specific cases of prostate cancer. Matsuyama et al. showed
the specific deletion of the 8p23 band in prostate cancer cases, as monitored by FISH with D8S7
probe. (Matsuyama H, Pan Y, Skoog L, Tribukait B, Naito K, Ekman P, Lichter P, Bergerheim US,
Oncogene 1994 Oct; 9(10): 3071-3076). They were able to document a substantial number of cases
with deletions of 8p23 but retention of the 8p22 marker LPL. Moreover, Ichikawa et al. deduced the
existence of a prostate cancer metastasis suppressor gene and localized it to 8p23-q12 by studies of
metastasis suppression in highly metastatic rat prostate cells after transfer of human chromosomes.
(Ichikawa T, Nihei N, Kuramochi H, Kawana Y, Killary AM, Rinker-Schaeffer CW, Barrett JC,
Isaacs JT, Kugoh H, Oshimura M, Shimazaki J, Prostate Suppl. 1996; 6: 31-35).

Recently Washburn et al. were able to find substantial numbers of tumors with the allelic loss 
specific to 8p23 by LOH studies of 31 cases of human prostate cancer. (Washburn J, Woino K, and 
Macoska J, Proceedings of American Association for Cancer Research, March 1997; 38). In these
samples they were able to define the minimal overlapping region with deletions covering genetic
interval D8S262-D8S277.

Linkage Analysis Studies: Search for Prostate Cancer 
Linked Regions on Chromosome 8 
Microsatellite markers mapping to chromosome 8 were used by the inventors to perform 
linkage analysis studies on 194 individuals drawn from 47 families affected with prostate cancer.
While multiple point analysis led to weak linkage results, two point lod score analysis led to
non-significant results, as shown below.

Two point lod (parametric analysis)

In view of the non-significant results obtained with linkage analysis, a new mapping approach based
on linkage disequilibrium of biallelic markers was utilised to identify genes responsible for sporadic
cases of prostate cancer.

I.B. Linkage Disequilibrium Using Biallelic Markers To Identify Candidate Loci Responsible
For Disease

Linkage Disequilibrium 
Once a chromosomal region has been identified as potentially harboring a candidate gene 
associated with a sporadic trait, an excellent approach to refine the candidate gene's location within 
the identified region is to look for statistical associations between the trait and some marker genotype 
when comparing an affected (trait +) and a control (trait -) population.

Association studies have most usually relied on the use of biallelic markers. Biallelic markers 
are genome-derived polynucleotides that exhibit biallelic polymorphism at one single base position. 
By definition, the lowest allele frequency of a biallelic polymorphism is 1%; sequence variants that 
show allele frequencies below 1% are called rare mutations. There are potentially more than 10^7
biallelic markers lying along the human genome.

Association studies seek to establish correlations between traits and genetic markers and are 
based on the phenomenon of linkage disequilibrium (LD). LD is defined as the trend for alleles at 
nearby loci on haploid genomes to correlate in the population. If two genetic loci lie on the same 
chromosome, then sets of alleles on the same chromosomal segment (i.e., haplotypes) tend to be 
transmitted as a block from generation to generation. When not broken up by recombination,
haplotypes can be tracked not only through pedigrees but also through populations. The resulting 
phenomenon at the population level is that the occurrence of pairs of specific alleles at different loci 
on the same chromosome is not random, and the deviation from random is called linkage 
disequilibrium. 
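
The "deviation from random" described above is usually quantified by the coefficient D, the difference between the observed frequency of a two-locus haplotype and the product of the corresponding allele frequencies. The Python sketch below is illustrative only; the function name is hypothetical, and the normalized measures D' and r-squared that it also returns are standard population-genetics quantities that are not named in the text.

```python
from typing import Tuple


def linkage_disequilibrium(p_ab: float, p_a: float, p_b: float) -> Tuple[float, float, float]:
    """Return (D, |D'|, r^2) for alleles a and b at two biallelic loci.

    p_ab : frequency of haplotypes carrying both allele a and allele b
    p_a  : frequency of allele a at the first locus
    p_b  : frequency of allele b at the second locus
    """
    d = p_ab - p_a * p_b  # zero when the alleles associate at random
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = abs(d) / d_max if d_max > 0 else 0.0
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    r_squared = d * d / denom if denom > 0 else 0.0
    return d, d_prime, r_squared


if __name__ == "__main__":
    d, d_prime, r2 = linkage_disequilibrium(p_ab=0.30, p_a=0.40, p_b=0.50)
    print(f"D = {d:.3f}, |D'| = {d_prime:.3f}, r^2 = {r2:.3f}")
```

When the haplotype frequency equals the product of the allele frequencies, all three values are zero, corresponding to random association of the alleles.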

Since results generated by association studies are essentially based on the quantitative
calculation of allele frequencies, they best apply to the analysis of germline mutations. This is mainly
due to the fact that allelic frequencies are difficult to quantify within tumor tissue samples because of 
the usual presence of normal cells within the studied tumor samples. Association studies applied to 
cancer genetics will therefore be best suited to the identification of tumor suppressor genes. 

Trait Localization by Linkage Disequilibrium Mapping

Any gene responsible or partly responsible for a given trait will be in LD with some flanking 
markers. To map such a gene, specific alleles of these flanking markers which are associated with the 
gene or genes responsible for the trait are identified. Although the following discussion of techniques
for finding the gene or genes associated with a particular trait using linkage disequilibrium mapping
refers to locating a single gene which is responsible for the trait, it will be appreciated that the same
techniques may also be used to identify genes which are partially responsible for the trait.

Association studies are conducted within the general population (as opposed to the linkage
analysis techniques discussed above which are limited to studies performed on related individuals in
one or several affected families).

Association between a biallelic marker A and a trait T may primarily occur as a result of three
possible relationships between the biallelic marker and the trait. First, allele a of biallelic marker A is
directly responsible for trait T (e.g., Apo E e4 allele and Alzheimer's disease). However, since the
majority of the biallelic markers used in genetic mapping studies are selected randomly, they mainly
map outside of genes. Thus, the likelihood of allele a being a functional mutation directly related to
trait T is very low.

An association between a biallelic marker A and a trait T may also occur when the biallelic
marker is very closely linked to the trait locus. In other words, an association occurs when allele a is
in linkage disequilibrium with the trait-causing allele. When the biallelic marker is in close proximity
to a gene responsible for the trait, more extensive genetic mapping will ultimately allow a gene to be
discovered near the marker locus which carries mutations in people with trait T (i.e. the gene
responsible for the trait or one of the genes responsible for the trait). As will be further exemplified
below using a group of biallelic markers which are in close proximity to the gene responsible for the
trait, the location of the causal gene can be deduced from the profile of the association curve between
the biallelic markers and the trait. The causal gene will be found in the vicinity of the marker
showing the highest association with the trait.

Finally, an association between a biallelic marker and a trait may occur when people with the
trait and people without the trait correspond to genetically different subsets of the population who,
coincidentally, also differ in the frequency of allele a (population stratification). This phenomenon is
avoided by using large heterogeneous samples.

Association studies are particularly suited to the efficient identification of susceptibility genes
that present common polymorphisms, and are involved in multifactorial traits whose frequency is
relatively higher than that of diseases with monofactorial inheritance.

Application of Linkage Disequilibrium Mapping
to Candidate Gene Identification
The general strategy of association studies using a set of biallelic markers is to scan two
pools of individuals (affected individuals and unaffected controls) characterized by a well defined
phenotype in order to measure the allele frequencies for a number of the chosen markers in each of
these pools. If a positive association with a trait is identified using an array of biallelic markers
having a high enough density, the causal gene will be physically located in the vicinity of the
associated markers, since the markers showing positive association to the trait are in linkage
disequilibrium with the trait locus. Regions harboring a gene responsible for a particular trait which
are identified through association studies using high density sets of biallelic markers will, on average,
be 20 - 40 times shorter in length than those identified by linkage analysis.

Once a positive association is confirmed as described above, BACs (bacterial artificial 
chromosomes) obtained from human genomic libraries, constructed as described below, harboring the 
markers identified in the association analysis are completely sequenced.

Once a candidate region has been sequenced and analyzed, the functional sequences within 
the candidate region (exons and promoters, and other potential regulatory regions) are scanned for 
mutations which are responsible for the trait by comparing the sequences of a selected number of 
controls and affected individuals using appropriate software. Candidate mutations are further 
confirmed by screening a larger number of affected individuals and controls using the
microsequencing techniques described below. 

Candidate mutations are identified as follows. A pair of oligonucleotide primers is designed 
in order to amplify the sequences of every predicted functional region. PCR amplification of each 
predicted functional sequence is carried out on genomic DNA samples from affected patients and 
unaffected controls. Amplification products from genomic PCR are subjected to automated dideoxy
terminator sequencing reactions and electrophoresed on ABI 377 sequencers. Following gel image
analysis and DNA sequence extraction, the sequence data are automatically analyzed to detect the
presence of sequence variations among affected cases and unaffected controls. Sequences are
systematically verified by comparing the sequences of both DNA strands of each individual.

Polymorphisms are then verified by screening a larger population of affected individuals and
controls using the microsequencing technique described below in an individual test format.
Polymorphisms are considered as candidate mutations when present in affected individuals and 
controls at frequencies compatible with the expected association results. 

Association Studies: Statistical Analysis and Haplotyping
As mentioned above, linkage analysis typically localizes a disease gene to a chromosomal
region of several megabases. Further refinement in location requires the analysis of additional
families in order to increase the number of recombinants. However, this approach becomes
unfeasible because recombination is rarely observed even within large pedigrees (Boehnke, M. 1994,
Am. J. Hum. Genet. 55: 379-390).

Linkage disequilibrium, the nonrandom association of alleles at linked loci, may offer an
alternative method of obtaining additional recombinants. When a chromosome carrying a mutant
allele of a gene responsible for a given trait is first introduced into a population as a result of either
mutation or migration, the mutant allele necessarily resides on a chromosome having a unique set of
linked markers (haplotype). Consequently, there is complete disequilibrium between these markers
and the disease mutation: the disease mutation is present only linked to a specific set of marker
alleles. Through subsequent generations, recombinations occur between the disease mutation and
these marker polymorphisms, resulting in a gradual disappearance of disequilibrium. The degree of
disequilibrium dissipation depends on the recombination frequency, so the markers closest to the
disease gene will tend to show higher levels of disequilibrium than those that are farther away (Jorde
LB, 1995, Am. J. Hum. Genet. 56: 11-14). Because linkage disequilibrium patterns in a present-day
population reflect the action of recombination through many past generations, disequilibrium analysis
effectively increases the sample of recombinants. Thus the mapping resolution achieved through the
analysis of linkage disequilibrium patterns is much higher than that of linkage analysis.
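
The dependence of disequilibrium on recombination frequency can be illustrated with the standard textbook model in which disequilibrium shrinks by a factor (1 - theta) per generation, theta being the recombination fraction between marker and disease mutation. The sketch below is a simplified illustration under idealized assumptions (random mating, constant theta, no selection or drift); the function name and the example values are hypothetical.

```python
def ld_after_generations(d0: float, theta: float, generations: int) -> float:
    """Disequilibrium remaining after a number of generations of recombination.

    d0    : initial disequilibrium when the mutation entered the population
    theta : recombination fraction between the marker and the mutation (0 to 0.5)
    """
    if not 0.0 <= theta <= 0.5:
        raise ValueError("recombination fraction must lie in [0, 0.5]")
    if generations < 0:
        raise ValueError("generations must be non-negative")
    return d0 * (1.0 - theta) ** generations


if __name__ == "__main__":
    # Compare a tightly linked marker with a more distant one after 100 generations.
    for theta in (0.001, 0.01, 0.1):
        remaining = ld_after_generations(1.0, theta, 100)
        print(f"theta = {theta}: fraction of initial disequilibrium = {remaining:.4f}")
```

Under this model a tightly linked marker retains most of its initial disequilibrium after many generations while a distant marker retains almost none, which is the basis for the finer mapping resolution described above.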

In practice, in order to define the regions bearing a candidate gene, the affected and control
populations are genotyped using an appropriate number of biallelic markers (at a density of 1 marker
every 50-150 kilobases). Then, a marker/trait association study is performed that compares the
genotype frequency of each biallelic marker in the affected and control populations by means of a chi
square statistical test (one degree of freedom).
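
A one-degree-of-freedom chi square comparison of this kind can be sketched as follows. This is a minimal illustration, not the procedure actually used: it assumes the comparison is made on a 2 x 2 table of allele counts (two alleles by two populations), the function name is hypothetical, and the counts in the example are invented.

```python
import math
from typing import Tuple


def allele_chi_square(case_a: int, case_b: int,
                      ctrl_a: int, ctrl_b: int) -> Tuple[float, float]:
    """Pearson chi square (1 d.f.) for a 2 x 2 table of allele counts.

    Returns the statistic and its p-value. For one degree of freedom the
    upper-tail probability of a chi square value x is erfc(sqrt(x / 2)).
    """
    row_case, row_ctrl = case_a + case_b, ctrl_a + ctrl_b
    col_a, col_b = case_a + ctrl_a, case_b + ctrl_b
    if min(row_case, row_ctrl, col_a, col_b) == 0:
        raise ValueError("every row and column of the table must be non-empty")
    n = row_case + row_ctrl
    chi2 = n * (case_a * ctrl_b - case_b * ctrl_a) ** 2 / (
        row_case * row_ctrl * col_a * col_b)
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


if __name__ == "__main__":
    # Invented counts: 100 affected and 100 control individuals (200 alleles each).
    chi2, p = allele_chi_square(case_a=120, case_b=80, ctrl_a=90, ctrl_b=110)
    print(f"chi square = {chi2:.2f}, p = {p:.4f}")
```

When many markers are tested in the same region, the individual p-values would ordinarily need to be interpreted with the number of tests in mind.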

After the first screening, additional markers within the region showing positive association
are genotyped in the affected and control populations. Two types of complementary analysis are then
performed. First, a marker/trait association study (as described above) is performed to refine the
location of the gene responsible for the trait. In addition, a haplotype association analysis is
performed to define the frequency and the type of the ancestral/preferential carrier haplotype.
Haplotype analysis, by combining the informativeness of a set of biallelic markers, increases the
power of the association analysis, allowing false positive and/or negative data that may result from the
single marker studies to be eliminated.

The haplotype analysis is performed by estimating the frequencies of all possible haplotypes
for a given set of biallelic markers in the case and control populations, and comparing these
frequencies by means of a chi square statistical test (one degree of freedom). Haplotype estimations
are performed by applying the Expectation-Maximization (EM) algorithm (Excoffier L & Slatkin M,
1995, Mol. Biol. Evol. 12: 921-927), using the EM-HAPLO program (Hawley ME, Pakstis AJ & Kidd
KK, 1994, Am. J. Phys. Anthropol. 18: 104). The EM algorithm is used to estimate haplotype
frequencies in the case when only genotype data from unrelated individuals are available. The EM
algorithm is a generalized iterative maximum likelihood approach to estimation that is useful when
data are ambiguous and/or incomplete.
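
The principle of EM-based haplotype estimation can be sketched for the simplest case of two
biallelic markers, where only individuals heterozygous at both markers have an ambiguous phase.
The sketch below is an illustrative assumption and is not the EM-HAPLO program cited above; the
function name, the data layout (a 3 x 3 table of genotype counts) and the example counts are
hypothetical.

```python
def em_haplotype_frequencies(counts, max_iter=1000, tol=1e-12):
    """Estimate two-marker haplotype frequencies from unphased genotypes.

    counts[i][j] is the number of individuals carrying i copies of allele "1"
    at marker A and j copies of allele "1" at marker B (i, j in 0..2).
    Returns a dict keyed by (allele at A, allele at B).
    """
    # Alleles carried on the two chromosomes for 0, 1 or 2 copies of allele "1".
    alleles = {0: (0, 0), 1: (1, 0), 2: (1, 1)}
    haplotypes = [(1, 1), (1, 0), (0, 1), (0, 0)]
    fixed = {h: 0.0 for h in haplotypes}
    n_individuals = 0
    for i in range(3):
        for j in range(3):
            n = counts[i][j]
            n_individuals += n
            if i == 1 and j == 1:
                continue  # double heterozygotes are phase-ambiguous
            a, b = alleles[i], alleles[j]
            fixed[(a[0], b[0])] += n
            fixed[(a[1], b[1])] += n
    if n_individuals == 0:
        raise ValueError("no genotypes supplied")
    n_double_het = counts[1][1]
    total = 2.0 * n_individuals
    freqs = {h: 0.25 for h in haplotypes}
    for _ in range(max_iter):
        # E-step: split double heterozygotes between the two possible phases.
        cis = freqs[(1, 1)] * freqs[(0, 0)]
        trans = freqs[(1, 0)] * freqs[(0, 1)]
        share = cis / (cis + trans) if (cis + trans) > 0 else 0.5
        expected = dict(fixed)
        expected[(1, 1)] += n_double_het * share
        expected[(0, 0)] += n_double_het * share
        expected[(1, 0)] += n_double_het * (1.0 - share)
        expected[(0, 1)] += n_double_het * (1.0 - share)
        # M-step: re-estimate frequencies from the expected haplotype counts.
        new_freqs = {h: expected[h] / total for h in haplotypes}
        change = max(abs(new_freqs[h] - freqs[h]) for h in haplotypes)
        freqs = new_freqs
        if change < tol:
            break
    return freqs


if __name__ == "__main__":
    # Hypothetical genotype counts, not data from the application.
    example = [[5, 0, 0], [0, 10, 0], [0, 0, 5]]
    print(em_haplotype_frequencies(example))
```

With more than two markers the number of candidate haplotypes grows rapidly, but the same
alternation of expectation and maximization steps applies.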

The application of biallelic marker based linkage disequilibrium analysis to the 8p23 region to
identify a gene associated with prostate cancer is described below.
I.C. Application of Linkage Disequilibrium Mapping to the 8p23 Region

YAC Contig Construction in 8p23 Region
First, a YAC contig which contains the 8p23 region was constructed as follows. The CEPH-
Genethon YAC map for the entire human genome (Chumakov I.M. et al. A YAC contig map of the
human genome, Nature, 377 Supp.: 175-297, 1995) was used for detailed contig building in the region
around D8S262 and D8S277 genetic markers. Screening data available for regional genetic markers
D8S1706, D8S277, D8S1742, D8S518, D8S262, D8S1798, D8S1140, D8S561 and D8S1819 were
used to select the following set of CEPH YACs, localized within this region: 832_g_12, 787_c_11,
920_h_7, 807_a_1, 842_b_1, 745_a_3, 910_d_3, 879_f_11, 918_c_6, 764_c_7, 910_f_12, 967_c_11,
856_d_8, 792_a_6, 812_h_4, 873_c_8, 930_a_2, 807_a_1, 852_d_10. This set of YACs was tested
by PCR with the above mentioned genetic markers as well as with other publicly available markers
supposedly located within the 8p23 region. As a result of these studies, a YAC STS contig map was
generated around genetic markers D8S262 and D8S277. The two CEPH YACs, 920_h_7 (1170 kb
insert size) and 910_f_12 (1480 kb insert size) constitute a minimal tiling path in this region, with an
estimated size of ca. 2 Megabases.

During this mapping effort, the following publicly known STS markers were precisely located 
within the contig: WI-14718, WI-3831, D8S1413E, WI-8327, WI-3823, ND4. 

BAC Contig Construction Covering D8S262-D8S277
Fragment Within 8p23 Region of the Human Genome
Following construction of the YAC contig, a BAC contig was constructed as follows. BAC
libraries were obtained as described in Woo et al., Nucleic Acids Res., 1994, 22, 4922-4931. Briefly,
two different whole human genome libraries were produced by cloning BamHI or HindIII partially
digested DNA from a lymphoblastoid cell line (derived from individual N°8445, CEPH families) into
the pBeloBAC11 vector (Kim et al., Genomics, 1996, 34, 213-218). The library produced with the
BamHI partial digestion contained 110,000 clones with an average insert size of 150 kb, which
corresponds to 5 human haploid genome equivalents. The library prepared with the HindIII partial
digestion corresponds to 3 human genome equivalents with an average insert size of 150 kb.
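
The relationship between clone number, insert size and genome equivalents can be checked with a
short calculation. The sketch below assumes a haploid human genome size of about 3.3 x 10^6 kb,
a figure which is not stated in the present description; the function name is likewise hypothetical.

```python
def genome_equivalents(n_clones, avg_insert_kb, genome_size_kb=3.3e6):
    """Library coverage expressed in haploid genome equivalents.

    genome_size_kb defaults to an assumed haploid human genome size
    of about 3.3 million kilobases.
    """
    if genome_size_kb <= 0:
        raise ValueError("genome size must be positive")
    return n_clones * avg_insert_kb / genome_size_kb


if __name__ == "__main__":
    # Clone number and insert size of the BamHI library described above.
    print(genome_equivalents(110000, 150))
```

Under this assumption, 110,000 clones of 150 kb average insert size give approximately 5 genome
equivalents, in agreement with the value stated for the BamHI library.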

BAC Screening 

The human genomic BAC libraries obtained as described above were screened with all of the
above mentioned STSs. DNA from the clones in both libraries was isolated and pooled in a three
dimensional format ready for PCR screening with the above mentioned STSs using high throughput
PCR methods (Chumakov et al., Nature 1995, 377: 175-298). Briefly, three dimensional pooling
consists in rearranging the samples to be tested in a manner which allows the number of PCR
reactions required to screen the clones with STSs to be reduced by at least 100 fold, as compared to
screening each clone individually. PCR amplification products were detected by conventional
agarose gel electrophoresis combined with automated image capturing and processing.
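
The economy of three dimensional pooling can be illustrated as follows, assuming that each clone
is addressed by a (plate, row, column) coordinate and that one pool is prepared per plate, per row
and per column. The exact pooling layout used in the cited method is not given here, so the
dimensions below (288 plates of 16 x 24 wells) and the function names are assumptions for
illustration only.

```python
from itertools import product


def pool_counts(n_plates, n_rows, n_cols):
    """Return (number of clones, number of pools) for a 3-D pooling scheme
    with one pool per plate, one per row and one per column."""
    n_clones = n_plates * n_rows * n_cols
    n_pools = n_plates + n_rows + n_cols
    return n_clones, n_pools


def candidate_clones(positive_plates, positive_rows, positive_cols):
    """Clone coordinates consistent with the positive pools.

    When more than one pool is positive along a dimension, several
    candidates are returned and each must be tested individually.
    """
    return sorted(product(positive_plates, positive_rows, positive_cols))


if __name__ == "__main__":
    clones, pools = pool_counts(288, 16, 24)
    print(clones, "clones screened with", pools, "PCR reactions per STS")
    print(candidate_clones([3], [5], [7]))
```

The possibility of several candidate coordinates for one STS is the reason positive clones are
subsequently confirmed one by one.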

In a final step, STS-positive clones were checked individually. Subchromosomal localization
of BACs was systematically verified by fluorescence in situ hybridization (FISH), performed on
metaphasic chromosomes as described by Cherif et al., Proc. Natl. Acad. Sci. USA, 1990, 87: 6639-
6643.

BAC insert size was determined by Pulsed Field Gel Electrophoresis after digestion with
restriction enzyme NotI.

BAC Contig Analysis

The ordered BACs selected by STS screening and verified by FISH, were assembled into
contigs and new markers were generated by partial sequencing of insert ends from some of them.
These markers were used to fill the gaps in the contig of BAC clones covering the chromosomal
region around D8S277, having an estimated size of 2 megabases. Selected BAC clones from the
contig were subcloned and sequenced.

BAC Subcloning

Each BAC human DNA was first extracted using the alkaline lysis procedure and then
sheared by sonication. The obtained DNA fragments were end-repaired and electrophoresed on
preparative agarose gels. The fragments in the desired size range were isolated from the gel, purified
and ligated to a linearized, dephosphorylated, blunt-ended plasmid cloning vector (pBluescript II SK
(+)). Example 1 describes the BAC subcloning procedure.
Example 1

The cells obtained from three liters of overnight culture of each BAC clone were treated by
alkaline lysis using conventional techniques to obtain the BAC DNA containing the genomic DNA
inserts. After centrifugation of the BAC DNA in a cesium chloride gradient, ca. 50 µg of BAC DNA
was purified. 5-10 µg of BAC DNA was sonicated using three distinct conditions, to obtain fragments
of the desired size. The fragments were treated in a 50 µl volume with two units of Vent polymerase
for 20 min at 70°C, in the presence of the four deoxytriphosphates (100 µM). The resulting blunt-
ended fragments were separated by electrophoresis on low-melting point 1% agarose gels (60 Volts
for 3 hours). The fragments were excised from the gel and treated with agarase. After chloroform
extraction and dialysis on Microcon 100 columns, DNA in solution was adjusted to a 100 ng/µl
concentration. A ligation was performed overnight by adding 100 ng of BAC fragmented DNA to 20
ng of pBluescript II SK (+) vector DNA linearized by enzymatic digestion, and treated by alkaline
phosphatase. The ligation reaction was performed in a 10 µl final volume in the presence of 40
units/µl T4 DNA ligase (Epicentre). The ligated products were electroporated into the appropriate
cells (ElectroMAX E. coli DH10B cells). IPTG and X-gal were added to the cell mixture, which was
then spread on the surface of an ampicillin-containing agar plate. After overnight incubation at 37°C,
recombinant (white) colonies were randomly picked and arrayed in 96 well microplates for storage
and sequencing.

Partial Sequencing of BACs
At least 30 of the obtained BAC clones were sequenced by the end pair-wise method (500 bp
sequence from each end) using a dye-primer cycle sequencing procedure. Pair-wise sequencing was
performed until a map allowing the relative positioning of selected markers along the corresponding
DNA region was established. Example 2 describes the sequencing and ordering of the BAC inserts.

Example 2 

The subclone inserts were amplified by PCR on overnight bacterial cultures, using vector
primers flanking the insertions. The insert extremity sequences (on average 500 bases at each end)
were determined by fluorescent automated sequencing on ABI 377 sequencers, with the ABI Prism
DNA Sequencing Analysis software (version 2.1.2).

The sequence fragments from BAC subclones were assembled using Gap4 software from R.
Staden (Bonfield et al. 1995). This software allows the reconstruction of a single sequence from
sequence fragments. The sequence deduced from the alignment of different fragments is called the
consensus sequence. We used directed sequencing techniques (primer walking) to complete
sequences and link contigs.

Figure 1 shows the overlapping BAC subclones (labeled BAC) which make up the assembled
contig and the positions of the publicly known STS markers along the contig.

Identification of Biallelic Markers Lying Along the BAC Contig
Following assembly of the BAC contig, biallelic markers lying along the contig were then
identified. Given that the assessed distribution of informative biallelic markers in the human genome
(biallelic polymorphisms with a heterozygosity rate higher than 42%) is one in 2.5 to 3 kb, six 500 bp
genomic fragments have to be screened in order to identify 1 biallelic marker. Six pairs of primers
per potential marker, each one defining a ca. 500 bp amplification fragment, were derived from the
above mentioned BAC partial sequences. All primers contained a common upstream oligonucleotide
tail enabling the easy systematic sequencing of the resulting amplification fragments. Amplification
of each BAC-derived sequence was carried out on pools of DNA from ca. 100 individuals. The
conditions used for the polymerase chain reaction were optimized so as to obtain more than 95% of
PCR products giving 500 bp-sequence reads.
The amplification products from genomic PCR using the oligonucleotides derived from the
BAC subclones were subjected to automated dideoxy terminator sequencing reactions using a dye-
primer cycle sequencing protocol. Following gel image analysis and DNA sequence extraction,
sequence data were automatically processed with appropriate software to assess sequence quality and
to detect the presence of biallelic sites among the pooled amplified fragments. Biallelic sites were
systematically verified by comparing the sequences of both strands of each pool.

The detection limit for the frequency of biallelic polymorphisms detected by sequencing pools
of 100 individuals is 0.3 +/- 0.05 for the minor allele, as verified by sequencing pools of known allelic
frequencies. Thus, the biallelic markers selected by this method will be "informative biallelic
markers" since they have a frequency of 0.3 to 0.5 for the minor allele and 0.5 to 0.7 for the major
allele, therefore an average heterozygosity rate higher than 42%.
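
The 42% figure follows from the expected heterozygosity of a biallelic marker, H = 2pq, evaluated
at the lower bound of the minor allele frequency. The short sketch below assumes Hardy-Weinberg
proportions, which the description does not state explicitly; the function name is hypothetical.

```python
def expected_heterozygosity(minor_allele_freq):
    """Expected heterozygosity 2pq of a biallelic marker,
    assuming Hardy-Weinberg proportions."""
    p = minor_allele_freq
    if not 0.0 <= p <= 0.5:
        raise ValueError("minor allele frequency must lie between 0 and 0.5")
    return 2.0 * p * (1.0 - p)


if __name__ == "__main__":
    # Bounds of the minor allele frequency range given in the text.
    for freq in (0.3, 0.4, 0.5):
        print(freq, expected_heterozygosity(freq))
```

A minor allele frequency of 0.3 gives a heterozygosity of 0.42 and a frequency of 0.5 gives the
maximum of 0.5, so markers within this range have a heterozygosity of at least 42%.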

Example 3 describes the preparation of genomic DNA samples from the individuals screened
to identify biallelic markers.

Example 3 

The population used in order to generate biallelic markers in the region of interest consisted
of ca. 100 unrelated individuals corresponding to a French heterogeneous population.
DNA was extracted from peripheral venous blood of each donor as follows.

30 ml of blood were taken in the presence of EDTA. Cells (pellet) were collected after
centrifugation for 10 minutes at 2000 rpm. Red cells were lysed by a lysis solution (50 ml final
volume: 10 mM Tris pH 7.6; 5 mM MgCl2; 10 mM NaCl). The solution was centrifuged (10 minutes,
2000 rpm) as many times as necessary to eliminate the residual red cells present in the supernatant,
after resuspension of the pellet in the lysis solution.

The pellet of white cells was lysed overnight at 42°C with 3.7 ml of lysis solution composed
of:

- 3 ml TE 10-2 (Tris-HCl 10 mM, EDTA 2 mM) / NaCl 0.4 M

- 200 µl SDS 10%

- 500 µl K-proteinase (2 mg K-proteinase in TE 10-2 / NaCl 0.4 M).

For the extraction of proteins, 1 ml saturated NaCl (6M) (1/3.5 v/v) was added. After 
vigorous agitation, the solution was centrifuged for 20 minutes at 10000 rpm. 

For the precipitation of DNA, 2 to 3 volumes of 100% ethanol were added to the previous
supernatant, and the solution was centrifuged for 30 minutes at 2000 rpm. The DNA solution was
rinsed three times with 70% ethanol to eliminate salts, and centrifuged for 20 minutes at 2000 rpm.
The pellet was dried at 37°C, and resuspended in 1 ml TE 10-1 or 1 ml water. The DNA
concentration was evaluated by measuring the OD at 260 nm (1 unit OD = 50 µg/ml DNA).

To determine the presence of proteins in the DNA solution, the OD 260 / OD 280 ratio was
determined. Only DNA preparations having an OD 260 / OD 280 ratio between 1.8 and 2 were used in
the subsequent steps described below.
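
The two spectrophotometric checks described above can be expressed as a short calculation,
assuming the usual conversion for double-stranded DNA of 50 µg/ml per OD 260 unit. The
function names and the dilution_factor parameter are hypothetical and are introduced only for
illustration.

```python
def dna_concentration_ug_per_ml(od260, dilution_factor=1.0):
    """DNA concentration in ug/ml from the OD at 260 nm.

    Assumes 1 OD unit at 260 nm = 50 ug/ml of double-stranded DNA.
    dilution_factor is the (hypothetical) dilution applied before reading.
    """
    return od260 * 50.0 * dilution_factor


def passes_purity_check(od260, od280, low=1.8, high=2.0):
    """True when the OD 260 / OD 280 ratio lies within the accepted range."""
    if od280 <= 0:
        raise ValueError("OD 280 must be positive")
    return low <= od260 / od280 <= high


if __name__ == "__main__":
    # Hypothetical readings, not data from the application.
    print(dna_concentration_ug_per_ml(0.5, dilution_factor=10))
    print(passes_purity_check(0.95, 0.5))
```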

DNA Amplification 

Once each BAC was isolated, pairs of primers, each one defining a 500 bp-amplification
fragment, were designed. Each of the primers contained a common oligonucleotide tail upstream of
the specific bases targeted for amplification, allowing the amplification products from each set of
primers to be sequenced using the common sequence as a sequencing primer. The primers used for
the genomic amplification of sequences derived from BACs were defined with the OSP software
(Hillier L. and Green P., PCR Methods Appl., 1991, 1: 124-8). The synthesis of primers was performed
following the phosphoramidite method, on a GENSET UFPS 24.1 synthesizer.
Example 4 provides the procedures used in the amplification reactions.
Example 4

The amplification of each sequence was performed by PCR (Polymerase Chain Reaction) as
follows:

- final volume 50 µl

- genomic DNA 100 ng

- MgCl2 2 mM

- dNTP (each) 200 µM

- primer (each) 7.5 pmoles

- Ampli Taq Gold DNA polymerase (Perkin) 1 unit

- PCR buffer (10X = 0.1 M Tris HCl pH 8.3, 0.5 M KCl) 1X.

The amplification was performed on a Perkin Elmer 9600 Thermocycler or MJ Research
PTC200 with heating lid. After heating at 94°C for 10 minutes, 35 cycles were performed. Each
cycle comprised: 30 sec at 94°C, 1 minute at 55°C, and 30 sec at 72°C. For final elongation, 7
minutes at 72°C ended the amplification.

The obtained quantity of amplification products was determined on 96-well microtiter plates,
using a fluorimeter and Picogreen as intercalating agent (Molecular Probes).

The sequences of the amplification products were determined for each of the approximately
100 individuals from whom genomic DNA was obtained. Those amplification products which
contained biallelic markers were identified.

Figure 1 shows the locations of the biallelic markers along the 8p23 BAC contig. This first
set of markers corresponds to a medium density map of the candidate locus, with an inter-marker
distance averaging 50 kb-150 kb.

A second set of biallelic markers was then generated as described above in order to provide a
very high-density map of the region identified using the first set of markers which can be used to
conduct association studies, as explained below. The high density map has markers spaced on
average every 2-50 kb.

The biallelic markers were then used in association studies as described below. 

Collection of DNA samples from affected and non-affected individuals
Prostate cancer patients were recruited according to clinical inclusion criteria based on
pathological or radical prostatectomy records. Control cases included in this study were both
ethnically- and age-matched to the affected cases; they were checked for both the absence of all
clinical and biological criteria defining the presence or the risk of prostate cancer, and for the absence
of related familial prostate cancer cases. Both affected and control individuals corresponded to
unrelated cases.

The two following pools of independent individuals were used in the association studies. The
first pool, comprising individuals suffering from prostate cancer, contained 185 individuals. Of these
185 cases of prostate cancer, 45 cases were sporadic and 140 cases were familial. The second pool,
the control pool, contained 104 non-diseased individuals.

Haplotype analysis was conducted using additional diseased (total samples: 281) and control
samples (total samples: 130), from individuals recruited according to similar criteria.
Genotyping Affected and Control Individuals

The general strategy to perform the association studies was to individually scan the DNA 
samples from all individuals in each of the two populations described above in order to establish the 
allele frequencies of the above described biallelic markers in each of these populations. 

Allelic frequencies of the above-described biallelic markers in each population were
determined by performing microsequencing reactions on amplified fragments obtained by genomic
PCR performed on the DNA samples from each individual.

DNA samples and amplification products from genomic PCR were obtained in similar
conditions as those described above for the generation of biallelic markers, and subjected to
automated microsequencing reactions using fluorescent ddNTPs (specific fluorescence for each
ddNTP) and the appropriate oligonucleotide microsequencing primers which hybridized just upstream
of the polymorphic base. Once specifically extended at the 3' end by a DNA polymerase using the
complementary fluorescent dideoxynucleotide analog (thermal cycling), the primer was precipitated
to remove the unincorporated fluorescent ddNTPs. The reaction products were analyzed by
electrophoresis on ABI 377 sequencing machines.
Example 5 describes one microsequencing procedure.

Example 5 

5 µl of PCR products in a microtiter plate were added to 5 µl purification mix (2U SAP
(Amersham); 2U Exonuclease I (Amersham); 1 µl SAP 10X buffer: 400 mM Tris-HCl pH 8, 100 mM
MgCl2; H2O final volume 5 µl). The reaction mixture was incubated 30 minutes at 37°C, and
denatured 10 minutes at 94°C. After 10 sec centrifugation, the microsequencing reaction was
performed on line with the whole purified reaction mixture (10 µl) in the microplate using 10 pmol
microsequencing oligonucleotide (23mers, GENSET, crude synthesis, 5 OD), 0.5 U Thermosequenase
(Amersham), 1.25 µl Thermosequenase 16X buffer (Amersham), both of the fluorescent ddNTPs
(Perkin Elmer) corresponding to the polymorphism (0.025 µl ddTTP and ddCTP, 0.05 µl ddATP and
ddGTP), H2O to a final volume of 20 µl. A PCR program on a Gene Amp 9600 thermocycler was
carried out as follows: 4 minutes at 94°C; 5 sec at 55°C / 10 sec at 94°C for 20 cycles. The reaction
product was incubated at 4°C until precipitation. The microtiter plate was centrifuged 10 sec at 1500
rpm. 19 µl MgCl2 2 mM and 55 µl 100% ethanol were added in each well. After 15 minute
incubation at room temperature, the microtiter plate was centrifuged at 3300 rpm 15 minutes at 4°C.
Supernatants were discarded by inverting the microtiter plate on a box folded to proper size and by
centrifugation at 300 rpm 2 minutes at 4°C afterwards. The microplate was then dried 5 minutes in a
vacuum drier. The pellets were resuspended in 2.5 µl formamide EDTA loading buffer (0.7 µl of 9
µg/µl dextran blue in 25 mM EDTA and 1.8 µl formamide). A 10% polyacrylamide gel / 12 cm / 64
wells was pre-run for 5 minutes on an ABI 377 sequencer. After 5 minutes denaturation at 100°C,
0.8 µl of each microsequencing reaction product was loaded in each well of the gel. After migration
(2 h 30 for 2 microtiter plates of PCR products per gel), the fluorescent signals emitted by the
incorporated ddNTPs were analyzed on the ABI 377 sequencer using the GENESCAN software
(Perkin Elmer). Following gel analysis, data were automatically processed with software that
allowed the determination of the alleles of biallelic markers present in each amplified fragment.

I.D. Initial Association Studies

Association studies were run in two successive steps. In a first step, a rough localization of
the candidate gene was achieved by determining the frequencies of the biallelic markers of Figure 1 in
the affected and unaffected populations. The results of this rough localization are shown in Figure 2.
This analysis indicated that a gene responsible for prostate cancer was located near the biallelic
marker designated 4-67.

In a second phase of the analysis, the position of the gene responsible for prostate cancer was
further refined using the very high density set of markers described above. The results of this
localization are shown in Figure 3.

As shown in Figure 3, the second phase of the analysis confirmed that the gene responsible
for prostate cancer was near the biallelic marker designated 4-67, most probably within a ca. 150 kb
region comprising the marker.

Haplotype analysis

The allelic frequencies of each of the alleles of biallelic markers 99-123, 4-26, 4-14, 4-77, 99-
217, 4-67, 99-213, 99-221, and 99-135 (SEQ ID NOs: 21-38) were determined in the affected and
unaffected populations. Table 1 lists the internal identification numbers of the markers used in the
haplotype analysis (SEQ ID NOs: 21-38), the alleles of each marker, the most frequent allele in both
unaffected individuals and individuals suffering from prostate cancer, the least frequent allele in both
unaffected individuals and individuals suffering from prostate cancer, and the frequencies of these
alleles in each population.

Among all the theoretical potential different haplotypes based on 2 to 9 markers, 11
haplotypes showing a strong association with prostate cancer were selected. The results of these
haplotype analyses are shown in Figure 4.

Figures 2, 3, and 4 aggregate linkage analysis results with sequencing results which permitted 
the physical order and/or the distance between markers to be estimated. 

The significance of the values obtained in Figure 4 is underscored by the following results
of computer simulations. For the computer simulations, the data from the affected individuals and the
unaffected controls were pooled and randomly allocated to two groups which contained the same
number of individuals as the affected and unaffected groups used to compile the data summarized in
Figure 4. A haplotype analysis was run on these artificial groups for the six markers included in
haplotype 5 of Figure 4. This experiment was reiterated 100 times and the results are shown in Figure
5. Among 100 iterations, only 5% of the obtained haplotypes are present with a p-value below 1X10"^
as compared to the p-value of 9X10''' for haplotype 5 of Figure 4. Furthermore, for haplotype 5 of
Figure 4, only 6% of the obtained haplotypes have a significance level below 5X1 0■^ while none of
them show a significance level below 5X10"^
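
The random reallocation procedure described above is a permutation test. A simplified sketch is
given below: for brevity it permutes a single carrier indicator per individual and recomputes a 2 x 2
chi square statistic, rather than re-running a full haplotype analysis as in the simulations reported
here. The function names, the seed parameter and the example data are assumptions for
illustration only.

```python
import random


def chi_square_statistic(a, b, c, d):
    """Pearson chi square for a 2 x 2 table; 0.0 if a margin is empty."""
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        return 0.0
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / denom


def permutation_test(case_values, control_values, n_iter=100, seed=0):
    """Return (observed statistic, fraction of permutations at least as extreme).

    case_values / control_values hold 1 for carriers and 0 for non-carriers.
    Individuals are pooled and randomly reallocated to two groups of the
    original sizes, n_iter times.
    """
    def statistic(cases, controls):
        a = sum(cases)
        c = sum(controls)
        return chi_square_statistic(a, len(cases) - a, c, len(controls) - c)

    rng = random.Random(seed)
    observed = statistic(case_values, control_values)
    pooled = list(case_values) + list(control_values)
    n_cases = len(case_values)
    hits = 0
    for _ in range(n_iter):
        rng.shuffle(pooled)
        if statistic(pooled[:n_cases], pooled[n_cases:]) >= observed:
            hits += 1
    return observed, hits / n_iter


if __name__ == "__main__":
    # Hypothetical carrier data, not data from the application.
    cases = [1] * 40 + [0] * 10
    controls = [1] * 10 + [0] * 40
    print(permutation_test(cases, controls))
```

A small fraction of permuted data sets reaching the observed statistic indicates that the observed
association is unlikely to arise from random allocation of individuals to the two groups.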

Thus, using the data of Figure 4 and evaluating the associations for single marker alleles or for
haplotypes will permit estimation of the risk a corresponding carrier has to develop prostate cancer.
Significance thresholds of relative risks will be adapted to the reference sample population used.

The diagnostic techniques may employ a variety of methodologies to determine whether a test
subject has a biallelic marker pattern associated with an increased risk of developing prostate cancer
or suffers from prostate cancer resulting from a mutant PG1 allele. These include any method
enabling the analysis of individual chromosomes for haplotyping, such as family studies, single sperm
DNA analysis or somatic hybrids.

In each of these methods, a nucleic acid sample is obtained from the test subject and the
biallelic marker pattern for one or more of the biallelic markers listed in Figures 4, 6A and 6B is
determined. The biallelic markers listed in Figure 6A are those which were used in the haplotype
analysis of Figure 4. The first column of Figure 6A lists the BAC clones in which the biallelic
markers lie. The second column of Figure 6A lists the internal identification number of the marker.
The third column of Figure 6A lists the sequence identification number for a first allele of the biallelic
markers. The fourth column of Figure 6A lists the sequence identification number for a second allele
of the biallelic markers. For example, the first allele of the biallelic marker 99-123 has the sequence
of SEQ ID NO: 21 and the second allele of the biallelic marker has the sequence of SEQ ID NO: 30.
The fifth column of Figure 6A lists the sequences of upstream primers which are used to
generate amplification products containing the polymorphic bases of the biallelic markers. The sixth
column of Figure 6A lists the sequence identification numbers for the upstream primers.

The seventh column of Figure 6A lists the sequences of downstream primers which are used to
generate amplification products containing the polymorphic bases of the biallelic markers. The eighth
column of Figure 6A lists the sequence identification numbers for the downstream primers.

The ninth column of Figure 6A lists the position of the polymorphic base in the amplification
products generated using the upstream and downstream primers. The tenth column lists the identities
of the polymorphic bases found at the polymorphic positions in the biallelic markers. The eleventh
and twelfth columns list the locations of microsequencing primers in the biallelic markers which can
be used to determine the identities of the polymorphic bases.

In addition to the biallelic markers of SEQ ID NOs: 21-38, other biallelic markers (designated
99-1482, 4-73, 4-65) have been identified which are closely linked to one or more of the biallelic
markers of SEQ ID NOs: 21-38, SEQ ID NOs: 57-62, and the PG1 gene. These biallelic markers
include the markers of SEQ ID NOs: 57-62, which are listed in Figure 6B. The columns in Figure 6B
are identical to the corresponding columns in Figure 6A. SEQ ID NOs: 58, 59, 61, and 62 lie within
the PG1 gene of SEQ ID NO: 1 at the positions indicated in the accompanying Sequence Listing.

Genetic analysis of these additional biallelic markers is performed as follows. Nucleic acid
samples are obtained from individuals suffering from prostate cancer and unaffected individuals. The
frequencies at which each of the two alleles occur in the affected and unaffected populations are
determined using the methodologies described above. Association values are calculated to determine
the correlation between the presence of a particular allele or spectrum of alleles and prostate cancer.
The markers of SEQ ID NOs: 21-38 may also be included in the analysis used to calculate the risk
factors. The markers of SEQ ID NOs: 21-38 and SEQ ID NOs: 57-62 are used in diagnostic
techniques, such as those described below, to determine whether an individual is at risk for
developing prostate cancer or suffers from prostate cancer as a result of a mutation in the PG1 gene.
Example 6 describes methods for determining the biallelic marker pattern.

Example 6 

A nucleic acid sample is obtained from an individual to be tested for susceptibility to prostate
cancer or PG1 mediated prostate cancer. The nucleic acid sample is an RNA sample or a DNA
sample.

A PCR amplification is conducted using primer pairs which generate amplification products
containing the polymorphic nucleotides of one or more biallelic markers associated with prostate
cancer-related forms of PG1, such as the biallelic markers of SEQ ID NOs: 21-38, SEQ ID NOs: 57-
62, biallelic markers which are in linkage disequilibrium with the biallelic markers of SEQ ID NOs:
21-38, SEQ ID NOs: 57-62, biallelic markers in linkage disequilibrium with the PG1 gene, or
combinations thereof. In some embodiments, the PCR amplification is conducted using primer pairs
which generate amplification products containing the polymorphic nucleotides of several biallelic
markers. For example, in one embodiment, amplification products containing the polymorphic bases
of several biallelic markers selected from the group consisting of SEQ ID NOs: 21-38, SEQ ID NOs:
57-62, and biallelic markers which are in linkage disequilibrium with the biallelic markers of SEQ ID
NOs: 21-38, SEQ ID NOs: 57-62 or with the PG1 gene are generated. In another embodiment,
amplification products containing the polymorphic bases of two or more biallelic markers selected
from the group consisting of SEQ ID NOs: 21-38, SEQ ID NOs: 57-62, and biallelic markers which
are in linkage disequilibrium with the biallelic markers of SEQ ID NOs: 21-38, SEQ ID NOs: 57-62
or with the PG1 gene are generated. In another embodiment, amplification products containing the
polymorphic bases of five or more biallelic markers selected from the group consisting of SEQ ID
NOs: 21-38, SEQ ID NOs: 57-62, and biallelic markers which are in linkage disequilibrium with the
biallelic markers of SEQ ID NOs: 21-38, SEQ ID NOs: 57-62 or with the PG1 gene are generated. In
another embodiment, amplification products containing the polymorphic bases of more than five of
the biallelic markers selected from the group consisting of SEQ ID NOs: 21-38, SEQ ID NOs: 57-62,
and biallelic markers which are in linkage disequilibrium with the biallelic markers of SEQ ID NOs:
21-38, SEQ ID NOs: 57-62 or with the PG1 gene are generated.

For example, the primers used to generate the amplification products may comprise the
primers listed in Figure 6A or 6B (SEQ ID NOs: 39-56 and SEQ ID NOs: 63-68). Figures 6A and
6B provide exemplary primers which are used in the amplification reactions and the identities
and locations of the polymorphic bases in the amplification products which are produced with the
exemplary primers. The sequences of each of the alleles of the biallelic markers resulting from
amplification using the primers in Figures 6A and 6B are listed in the accompanying Sequence Listing
as SEQ ID NOs: 21-38 and 57-62.

The PCR primers are oligonucleotides of 10, 15, 20 or more bases in length which enable the
amplification of the polymorphic site in the markers. In some embodiments, the amplification product
produced using these primers is at least 100 bases in length (i.e. 50 nucleotides on each side of the
polymorphic base). In other embodiments, the amplification product produced using these primers is
at least 500 bases in length (i.e. 250 nucleotides on each side of the polymorphic base). In still further
embodiments, the amplification product produced using these primers is at least 1000 bases in length
(i.e. 500 nucleotides on each side of the polymorphic base).

It will be appreciated that the primers listed in Figures 6A and 6B are merely exemplary and
that any other set of primers which produce amplification products containing the polymorphic
nucleotides of one or more of the biallelic markers of SEQ ID NOs: 21-38 and SEQ ID NOs: 57-62 or
biallelic markers in linkage disequilibrium with the sequences of SEQ ID NOs: 21-38 and SEQ ID
NOs: 57-62 or with the PG1 gene, or a combination thereof, may be used in the diagnostic methods.

Following the PCR amplification, the identities of the polymorphic bases of one or more of
the biallelic markers of SEQ ID NOs: 21-38 and SEQ ID NOs: 57-62, or biallelic markers in linkage
disequilibrium with the sequences of SEQ ID NOs: 21-38 and SEQ ID NOs: 57-62 or with the PG1
gene, or a combination thereof, are determined. The identities of the polymorphic bases may be determined
using the microsequencing procedures described in Example 5 above and the microsequencing
primers listed as features in the sequences of SEQ ID NOs: 21-38 and SEQ ID NOs: 57-62. It will be
appreciated that the microsequencing primers listed as features in the sequences of SEQ ID NOs: 21-38
and SEQ ID NOs: 57-62 are merely exemplary and that any primer having a 3' end near the
polymorphic nucleotide, and preferably immediately adjacent to the polymorphic nucleotide, may be used.
Alternatively, the microsequencing analysis may be performed as described in Pastinen et al., Genome
Research 7:606-614 (1997), which is described in more detail below.
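
To illustrate what "a 3' end immediately adjacent to the polymorphic nucleotide" means in practice, the following Python sketch extracts a forward-strand and a reverse-strand primer from a marker sequence. It is a sketch under stated assumptions: the function names, the default primer length and the toy sequence are hypothetical and are not the primers of the Sequence Listing.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def microsequencing_primers(seq, poly_index, length=19):
    """Return (forward, reverse) primers of `length` bases whose 3' ends abut
    the polymorphic base at 0-based `poly_index`.

    The forward primer is the `length` bases immediately 5' of the site; the
    reverse primer is the reverse complement of the `length` bases
    immediately 3' of it.
    """
    if poly_index < length or poly_index + 1 + length > len(seq):
        raise ValueError("not enough flanking sequence for the requested length")
    forward = seq[poly_index - length:poly_index]
    reverse = reverse_complement(seq[poly_index + 1:poly_index + 1 + length])
    return forward, reverse


# Toy sequence (hypothetical); the polymorphic base is the T at index 9.
print(microsequencing_primers("AAACCCGGGTTTACGT", 9, length=5))
```

In a single-base extension assay either primer is then extended by one labeled terminator, so the incorporated base reports the allele at the polymorphic position.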

Alternatively, the PCR product may be completely sequenced to determine the identities of the
polymorphic bases in the biallelic markers. In another method, the identities of the polymorphic bases
in the biallelic markers are determined by hybridizing the amplification products to microarrays
containing allele specific oligonucleotides specific for the polymorphic bases in the biallelic markers.
The use of microarrays comprising allele specific oligonucleotides is described in more detail below.

It will be appreciated that the identities of the polymorphic bases in the biallelic markers may be
determined using techniques other than those listed above, such as conventional dot blot analyses.

Nucleic acids used in the above diagnostic procedures may comprise at least 10 consecutive
nucleotides in the biallelic markers of SEQ ID NOs: 21-38 and SEQ ID NOs: 57-62 or the sequences
complementary thereto. Alternatively, the nucleic acids used in the above diagnostic procedures may
comprise at least 15 consecutive nucleotides in the biallelic markers of SEQ ID NOs: 21-38 and SEQ
ID NOs: 57-62 or the sequences complementary thereto. In some embodiments, the nucleic acids used
in the above diagnostic procedures may comprise at least 20 consecutive nucleotides in the biallelic
markers of SEQ ID NOs: 21-38 and SEQ ID NOs: 57-62 or the sequences complementary thereto. In
still other embodiments, the nucleic acids used in the above diagnostic procedures may comprise at
least 30 consecutive nucleotides in the biallelic markers of SEQ ID NOs: 21-38 and SEQ ID NOs: 57-62
or the sequences complementary thereto. In further embodiments, the nucleic acids used in the
above diagnostic procedures may comprise more than 30 consecutive nucleotides in the biallelic
markers of SEQ ID NOs: 21-38 and SEQ ID NOs: 57-62 or the sequences complementary thereto. In
still further embodiments, the nucleic acids used in the above diagnostic procedures may comprise the
entire sequence of the biallelic markers of SEQ ID NOs: 21-38 and SEQ ID NOs: 57-62 or the
sequences complementary thereto.

LE, Identification and Sequencing of the PG1 Gene, and Localization of the PG1 Protein

The above haplotype analysis indicated that 171 kb of genomic DNA between biallelic
markers 4-14 and 99-221 totally or partially contains a gene responsible for prostate cancer.
Therefore, the protein coding sequences lying within this region were characterized to locate the gene
associated with prostate cancer. This analysis, described in further detail below, revealed a single
protein coding sequence in the 171 kb, which was designated as the PG1 gene.

Template DNA for sequencing the PG1 gene was obtained as follows. BACs 189E08 and
463F01 were subcloned as previously described. Plasmid inserts were first amplified by PCR on PE
9600 thermocyclers (Perkin-Elmer), using appropriate primers, AmpliTaqGold (Perkin-Elmer), dNTPs
(Boehringer), buffer and cycling conditions as recommended by the Perkin-Elmer Corporation.

PCR products were then sequenced using automatic ABI Prism 377 sequencers (Perkin Elmer,
Applied Biosystems Division, Foster City, CA). Sequencing reactions were performed using PE 9600
thermocyclers (Perkin Elmer) with standard dye-primer chemistry and ThermoSequenase (Amersham
Life Science). The primers were labeled with the JOE, FAM, ROX and TAMRA dyes. The dNTPs and
ddNTPs used in the sequencing reactions were purchased from Boehringer. Sequencing buffer, reagent
concentrations and cycling conditions were as recommended by Amersham.

Following the sequencing reaction, the samples were precipitated with EtOH, resuspended in
formamide loading buffer, and loaded on a standard 4% acrylamide gel. Electrophoresis was performed
for 2.5 hours at 3000 V on an ABI 377 sequencer, and the sequence data were collected and analyzed
using the ABI Prism DNA Sequencing Analysis Software, version 2.1.2.

The sequence data obtained as described above were transferred to a proprietary database, where
quality control and validation steps were performed. A proprietary base-caller ("Trace"), running on
a Unix system, automatically flagged suspect peaks, taking into account the shape of the peaks, the
inter-peak resolution, and the noise level. The proprietary base-caller also performed an automatic trimming.
Any stretch of 25 or fewer bases having more than 4 suspect peaks was considered unreliable and was
discarded. Sequences corresponding to cloning vector oligonucleotides were automatically removed
from the sequence. However, the resulting sequences may contain 1 to 5 bases belonging to the vector
sequences at their 5' end. If needed, these can easily be removed on a case by case basis.
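
The trimming rule above (more than 4 suspect peaks within a stretch of 25 or fewer bases) can be sketched as a sliding-window filter. The proprietary base-caller itself is not described in detail, so the following Python is only one possible reading of the rule: the function name and the choice to discard the span from the first to the last suspect peak of each offending window are assumptions.

```python
def unreliable_mask(suspect, window=25, max_suspect=4):
    """Flag bases that fall in an unreliable stretch.

    `suspect` is a list of booleans, one per base, True where the base-caller
    flagged a suspect peak. For every window of `window` bases holding more
    than `max_suspect` suspect peaks, the span from the first to the last
    suspect peak in that window is marked unreliable.
    """
    n = len(suspect)
    mask = [False] * n
    for start in range(max(1, n - window + 1)):
        hits = [i for i in range(start, min(n, start + window)) if suspect[i]]
        if len(hits) > max_suspect:
            for i in range(hits[0], hits[-1] + 1):
                mask[i] = True
    return mask


# Hypothetical 40-base read with five suspect peaks clustered at bases 10-18.
flags = [False] * 40
for i in (10, 12, 14, 16, 18):
    flags[i] = True
mask = unreliable_mask(flags)
print([i for i, bad in enumerate(mask) if bad])
```

A read with isolated suspect peaks (four or fewer in any 25-base window) passes through unmasked under this reading.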

The genomic sequence of the PG1 gene is provided in the accompanying Sequence Listing and
is designated as SEQ ID NO: 1.

Potential exons in BAC-derived human genomic sequences were located by homology searches
on protein, nucleic acid and EST (Expressed Sequence Tags) public databases. The main public databases
were locally reconstructed. The protein database, NRPU (Non-redundant Protein Unique), is formed by a
non-redundant fusion of the Genpept (Benson D.A. et al., Nucleic Acids Res. 24: 1-5 (1996)), Swissprot
(Bairoch, A. and Apweiler, R., Nucleic Acids Res. 24: 21-25 (1996)) and PIR/NBRF (George, D.G. et al.,
Nucleic Acids Res. 24: 17-20 (1996)) databases. Redundant data were eliminated by using the NRDB
software (Benson et al., supra) and internal repeats were masked with the XNU software (Benson et al.,
supra). Homologies found using the NRPU database allowed the identification of sequences
corresponding to potential coding exons related to known proteins.

The EST local database is composed of the gbest section (1-9) of GenBank (Benson et al.,
supra), and thus contains all publicly available transcript fragments. Homologies found with this
database allowed the localization of potentially transcribed regions.

The local nucleic acid database contained all sections of GenBank and EMBL (Rodriguez-Tome,
P. et al., Nucleic Acids Res. 24: 6-12 (1996)) except the EST sections. Redundant data were
eliminated as previously described.

Similarity searches in protein or nucleic acid databases were performed using the BLAST
software (Altschul, S.F. et al., J. Mol. Biol. 215: 403-410 (1990)). Alignments were refined using the
Fasta software, and multiple alignments used Clustal W. Homology thresholds were adjusted for each
analysis based on the length and the complexity of the tested region, as well as on the size of the
reference database.

Potential exon sequences identified as above were used as probes to screen cDNA libraries.
Extremities of positive clones were sequenced and the sequence stretches were positioned on the
genomic sequence of SEQ ID NO: 1. Primers were then designed using the results from these alignments
in order to enable the PG1 cloning procedure described below.

Cloning PG1 cDNA

PG1 cDNA was obtained as follows. 4 µl of an ethanol suspension containing 1 mg of human
prostate total RNA (Clontech Laboratories, Inc., Palo Alto, USA; catalogue No. 64038-1, lot 7040869)
was centrifuged, and the resulting pellet was air dried for 30 minutes at room temperature.

First strand cDNA synthesis was performed using the Advantage™ RT-for-PCR kit
(Clontech Laboratories, Inc., Palo Alto, USA; catalogue No. K1402-1). 1 µl of a 20 mM solution of primer
PGRT32: Trrrrri rri ' l ' l ' i ' l ' il ' i ' r TGAAAT (SEQ ID NO: 10) was added to 12.5 µl of RNA
solution in water, heated at 74°C for two and a half minutes and rapidly quenched in an ice bath. 10 µl
of 5x RT buffer (50 mM Tris-HCl, pH 8.3, 75 mM KCl, 3 mM MgCl2), 2.5 µl of dNTP mix (10 mM
each) and 1.25 µl of human recombinant placental RNase inhibitor were mixed with 1 µl of MMLV reverse
transcriptase (200 units). 6.5 µl of this solution were added to the RNA-primer mix and incubated at 42°C
for one hour. 80 µl of water were added and the solution was incubated at 94°C for 5 minutes.

5 µl of the resulting solution were used in a Long Range PCR reaction with hot start, in a 50 µl
final volume, using 2 units of rTth XL, 20 pmol/µl of each of GC1.5p1:
CTGTCCCTGGTGCTCCACACGTACTC (SEQ ID NO: 6) or GC1.5p2:
TGGTGCTCCACACGTACTCCATGCGC (SEQ ID NO: 7) and GC1.3p:
CTTGCCTGCTGGAGACACAGAATTTCGATAGCAC (SEQ ID NO: 9) primers with 35 cycles of
elongation for 6 minutes at 67°C in a thermocycler.

The sequence of the PG1 cDNA obtained as described above (SEQ ID NO: 3) is provided in the
accompanying Sequence Listing. Results of Northern blot analysis of prostate mRNAs support the
existence of a major PG1 cDNA having a 5-6 kb length.

Characterization of the PG1 Gene

The intron/exon structure of the gene was deduced by aligning the mRNA sequence from the
cDNA of SEQ ID NO: 3 and the genomic DNA sequence of SEQ ID NO: 1.

The positions of the introns and exons in the PG1 genomic DNA are provided in Figures 7
and 8. Figure 7 lists positions of the start and end nucleotides defining each of the at least 8 exons
(labeled Exons A-H) in the sequence of SEQ ID NO: 1, the locations and phases of the 5' and 3'
splice sites in the sequence of SEQ ID NO: 1, the position of the stop codon in the sequence of SEQ
ID NO: 1, and the position of the polyadenylation site in the sequence of SEQ ID NO: 1. Figure 8
shows the positions of the exons within the PG1 genomic DNA and the PG1 mRNA, the location of a
tyrosine phosphatase retro-pseudogene in the PG1 genomic DNA, the positions of the coding region
in the mRNA, and the locations of the polyadenylation signal and polyA stretch in the mRNA.
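
Once exon start and end positions on the genomic sequence are known, intron positions and the spliced transcript length follow by simple arithmetic. The Python sketch below shows that bookkeeping; the exon coordinates are hypothetical placeholders, not the values listed in Figure 7.

```python
def introns_from_exons(exons):
    """Given exons as 1-based inclusive (start, end) pairs on the genomic
    sequence, return the introns lying between consecutive exons."""
    ordered = sorted(exons)
    introns = []
    for (_, prev_end), (next_start, _) in zip(ordered, ordered[1:]):
        if next_start <= prev_end:
            raise ValueError("exons overlap")
        if next_start > prev_end + 1:
            introns.append((prev_end + 1, next_start - 1))
    return introns


def spliced_length(exons):
    """Length of the transcript obtained by joining all exons."""
    return sum(end - start + 1 for start, end in exons)


# Hypothetical three-exon gene (coordinates are illustrative only).
exons = [(100, 200), (300, 350), (500, 650)]
print(introns_from_exons(exons))
print(spliced_length(exons))
```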

As indicated in Figures 7 and 8, the PG1 gene comprises at least 8 exons, and spans more than
52 kb. The first intron contains a tyrosine phosphatase retropseudogene. A G/C rich putative
promoter region lies between nucleotides 1629 and 1870 of SEQ ID NO: 1. A CCAAT box is present
at nucleotide 1661 of SEQ ID NO: 1. The promoter region was identified as described in Prestridge,
D.S., Predicting Pol II Promoter Sequences Using Transcription Factor Binding Sites, J. Mol. Biol.
249:923-932 (1995).

It is possible that the methionine listed as being the initiating methionine in the PG1 protein
sequence of SEQ ID NO: 4 (based on the cDNA sequence of SEQ ID NO: 3) may actually be
downstream but in phase with another methionine which acts as the initiating methionine. The
genomic DNA sequence of SEQ ID NO: 1 contains a methionine upstream from the methionine at
position number 1 of the protein sequence of SEQ ID NO: 4. If the upstream methionine is in fact
the authentic initiation site, the sequence of the PG1 protein would be that of SEQ ID NO: 5. This
possibility is investigated by determining the exact position of the 5' end of the PG1 mRNA as
follows.
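
The notion of an upstream methionine "in phase" with the annotated start can be made concrete by scanning codon by codon in the 5' direction until a stop codon closes the reading frame. The following Python is a minimal sketch under the assumption that the input is a contiguous (intron-free) stretch of transcript sequence; the function name and the toy sequence are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def upstream_inframe_atgs(seq, atg_index):
    """Return 0-based positions of ATG codons that lie upstream of, and in
    frame with, the annotated start codon at `atg_index`.

    The scan moves 5' one codon at a time and stops at the first in-frame
    stop codon, since translation could not read through it.
    """
    hits = []
    i = atg_index - 3
    while i >= 0:
        codon = seq[i:i + 3]
        if codon in STOP_CODONS:
            break
        if codon == "ATG":
            hits.append(i)
        i -= 3
    return hits


# Toy sequence: TAG ATG GCC GCA ATG AAA TGA, annotated start at index 12.
print(upstream_inframe_atgs("TAGATGGCCGCAATGAAATGA", 12))
```

Such a scan only shows that an alternative start is possible in the sequence; whether the transcript actually extends that far is the experimental question addressed by mapping the 5' end of the mRNA.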

One way to determine the exact position of the 5' end of the PG1 mRNA is to perform a
5' RACE reaction using the Marathon-Ready human prostate cDNA kit from Clontech (Catalog No.
PT1156-1). For example, the RACE reaction may employ the PG1 primer PG15RACE196
CAATATCTGGACCCCGGTGTAATTCTC (SEQ ID NO: 8) as the first primer. The second primer
in the RACE reaction is PG15RACE130n having the sequence
GGTCGTCCAGCGCTTGGTAGAAG (SEQ ID NO: 2). The sequence analysis of the resulting PCR
product, or the product obtained with other PG1 specific primers, will give the exact sequence of the
initiation point of the PG1 transcript.

Alternatively, the 5' sequence of the PG1 transcript can be determined by conducting a PCR
amplification with a series of primers extending from the 5' end of the presently identified coding
region. In any event, the present invention contemplates use of PG1 nucleic acids and/or polypeptides
coding for or corresponding to either SEQ ID NO: 4 or SEQ ID NO: 5 or fragments thereof.

It is also possible that alternative splicing of the PG1 gene may result in additional translation
products not described above. It is also possible that there are sequences upstream or downstream of
the genomic sequence of SEQ ID NO: 1 which contribute to the translation products of the gene.
Finally, alternative promoters may result in PG1-derived transcripts other than those described herein.

The promoter activity of the region between nucleotides 1629 and 1870 can be verified as
described below. Alternatively, should this region lack promoter activity, the promoter responsible
for driving expression of the PG1 gene may be identified as described below.

Genomic sequences lying upstream of the PG1 gene are cloned into a suitable promoter reporter
vector, such as the pSEAP-Basic, pSEAP-Enhancer, pβgal-Basic, pβgal-Enhancer, or pEGFP-1
Promoter Reporter vectors available from Clontech. Briefly, each of these promoter reporter vectors
includes multiple cloning sites positioned upstream of a reporter gene encoding a readily assayable
protein such as secreted alkaline phosphatase, β-galactosidase, or green fluorescent protein. The
sequences upstream of the PG1 coding region are inserted into the cloning sites upstream of the reporter
gene in both orientations and introduced into an appropriate host cell. The level of reporter protein is
assayed and compared to the level obtained from a vector which lacks an insert in the cloning site. The
presence of an elevated expression level in the vector containing the insert with respect to the control
vector indicates the presence of a promoter in the insert. If necessary, the upstream sequences can be
cloned into vectors which contain an enhancer for augmenting transcription levels from weak promoter
sequences. A significant level of expression above that observed with the vector lacking an insert
indicates that a promoter sequence is present in the inserted upstream sequence.
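
The comparison between the insert-bearing vector and the insert-free control reduces to a fold-change calculation on replicate reporter readings. The Python sketch below illustrates that calculation only; the threshold of 3-fold and the example readings are hypothetical, and what counts as a "significant level of expression" is left to the experimenter.

```python
from statistics import mean


def fold_over_control(insert_readings, control_readings):
    """Mean reporter signal of the insert-bearing vector divided by the mean
    signal of the vector lacking an insert."""
    control = mean(control_readings)
    if control <= 0:
        raise ValueError("control signal must be positive")
    return mean(insert_readings) / control


def suggests_promoter(insert_readings, control_readings, threshold=3.0):
    """True when the fold change reaches `threshold` (an assumed cutoff)."""
    return fold_over_control(insert_readings, control_readings) >= threshold


# Hypothetical triplicate readings in arbitrary reporter units.
forward = [30, 34, 32]   # insert in the forward orientation
reverse = [5, 4, 6]      # same insert in the reverse orientation
control = [4, 4, 4]      # vector without insert
print(fold_over_control(forward, control), suggests_promoter(forward, control))
print(fold_over_control(reverse, control), suggests_promoter(reverse, control))
```

Running both orientations through the same calculation mirrors the text's instruction to test the insert in both orientations.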

Promoter sequences within the upstream genomic DNA may be further defined by constructing nested
deletions in the upstream DNA using conventional techniques such as Exonuclease III digestion. The
resulting deletion fragments can be inserted into the promoter reporter vector to determine whether the
deletion has reduced or obliterated promoter activity. In this way, the boundaries of the promoters may be
defined. If desired, potential individual regulatory sites within the promoter may be identified using site
directed mutagenesis or linker scanning to obliterate potential transcription factor binding sites within
the promoter individually or in combination. The effects of these mutations on transcription levels may be
determined by inserting the mutations into the cloning sites in the promoter reporter vectors.
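
Boundary mapping with nested deletions amounts to finding where, along a series of progressively shorter inserts, reporter activity is lost. The following Python sketch shows one way to record and read such a series; the function name, the 0.5 activity cutoff and the coordinates are hypothetical.

```python
def map_five_prime_boundary(constructs, cutoff=0.5):
    """Locate the 5' boundary of a promoter from a nested deletion series.

    `constructs` is a list of (five_prime_end, relative_activity) pairs, where
    `five_prime_end` is the coordinate at which each shortened insert begins
    and activity is expressed relative to the full-length insert. Returns
    (last_active_end, first_inactive_end); the boundary lies between the two.
    Returns None when no construct falls below `cutoff`.
    """
    previous = None
    for end5, activity in sorted(constructs):
        if activity < cutoff:
            return previous, end5
        previous = end5
    return None


# Hypothetical series: deleting past coordinate 1600 abolishes most activity.
series = [(1500, 1.0), (1600, 0.9), (1700, 0.2), (1800, 0.05)]
print(map_five_prime_boundary(series))
```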

Sequences within the PG1 promoter region which are likely to bind transcription factors may be
identified by homology to known transcription factor binding sites or through conventional mutagenesis
or deletion analyses of reporter plasmids containing the promoter sequence. For example, deletions may be
made in a reporter plasmid containing the promoter sequence of interest operably linked to an assayable
reporter gene. The reporter plasmids carrying various deletions within the promoter region are
transfected into an appropriate host cell and the effects of the deletions on expression levels are assessed.
Transcription factor binding sites within the regions in which deletions reduce expression levels may be
further localized using site directed mutagenesis, linker scanning analysis, or other techniques familiar to
those skilled in the art.

The promoters and other regulatory sequences located upstream of the PG1 gene may be used to
design expression vectors capable of directing the expression of an inserted gene in a desired spatial,
temporal, developmental, or quantitative manner. For example, since the PG1 promoter is presumably
active in the prostate, it can be used to construct expression vectors for directing gene expression in the
prostate.

Preferably, in such expression vectors, the PG1 promoter is placed near multiple restriction sites
to facilitate the cloning of an insert encoding a protein for which expression is desired downstream of the
promoter, such that the promoter is able to drive expression of the inserted gene. The promoter may be
inserted in conventional nucleic acid backbones designed for extrachromosomal replication, integration
into the host chromosomes or transient expression. Suitable backbones for the present expression
vectors include retroviral backbones, backbones from eukaryotic episomes such as SV40 or Bovine
Papilloma Virus, backbones from bacterial episomes, or artificial chromosomes.

Preferably, the expression vectors also include a polyA signal downstream of the multiple
restriction sites for directing the polyadenylation of mRNA transcribed from the gene inserted into the
expression vector.

Nucleic acids encoding proteins which interact with sequences in the PG1 promoter may be identified
using one-hybrid systems such as those described in the manual accompanying the Matchmaker One-Hybrid
System kit available from Clontech (Catalog No. K1603-1). Briefly, the Matchmaker One-hybrid
system is used as follows. The target sequence for which it is desired to identify binding proteins
is cloned upstream of a selectable reporter gene and integrated into the yeast genome. Preferably,
multiple copies of the target sequences are inserted into the reporter plasmid in tandem.

A library comprised of fusions between cDNAs to be evaluated for the ability to bind to the
promoter and the activation domain of a yeast transcription factor, such as GAL4, is transformed into the
yeast strain containing the integrated reporter sequence. The yeast are plated on selective media to select
cells expressing the selectable marker linked to the promoter sequence. The colonies which grow on the
selective media contain genes encoding proteins which bind the target sequence. The inserts in the genes
encoding the fusion proteins are further characterized by sequencing. In addition, the inserts may be inserted
into expression vectors or in vitro transcription vectors. Binding of the polypeptides encoded by the
inserts to the promoter DNA may be confirmed by techniques familiar to those skilled in the art, such as gel
shift analysis or DNAse protection analysis.

Analysis of PG1 Protein Sequence

The PG1 cDNA of SEQ ID NO: 3 encodes a 353 amino-acid protein (SEQ ID NO: 4). As
indicated in the accompanying Sequence Listing, a Prosite analysis indicated that the PG1 protein has
a leucine zipper motif, a potential glycosylation site, 3 potential casein kinase II phosphorylation sites,
a potential cAMP dependent protein kinase phosphorylation site, 2 potential tyrosine kinase
phosphorylation sites, 4 potential protein kinase C phosphorylation sites, 5 potential N-myristoylation
sites, 1 potential tyrosine sulfation site, and one potential amidation site.
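
Motif scans of this kind are pattern matches against the protein sequence. As an illustration, the leucine zipper motif is commonly written as the pattern L-x(6)-L-x(6)-L-x(6)-L (four leucines spaced seven residues apart); the Python sketch below searches for it with a regular expression. That this is the exact pattern used in the analysis reported above is an assumption, and the toy sequence is hypothetical rather than the PG1 protein.

```python
import re

# L-x(6)-L-x(6)-L-x(6)-L written as a regular expression. The lookahead
# lets overlapping occurrences be reported as separate hits.
LEUCINE_ZIPPER = re.compile(r"(?=(L.{6}L.{6}L.{6}L))")


def leucine_zipper_starts(protein):
    """Return 0-based start positions of every match of the pattern."""
    return [m.start() for m in LEUCINE_ZIPPER.finditer(protein)]


# Toy sequence with leucines at positions 2, 9, 16 and 23 (0-based).
toy = "MA" + "LAAAAAA" * 3 + "LK"
print(leucine_zipper_starts(toy))
```

A pattern hit only marks a candidate site; short patterns like this one match many proteins by chance, which is why the text speaks of "potential" sites.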

A search for membrane associated domains was conducted according to the methods
described in Argos, P. et al., Structural Prediction of Membrane-bound Proteins, Eur. J. Biochem.
128:565-575 (1982); Klein et al., Biochimica & Biophysica Acta 815:468-476 (1985); and Eisenberg
et al., J. Mol. Biol. 179:125-142 (1984). The search revealed 5 potential transmembrane domains
predicted to be integral membrane domains. These results suggest that the PG1 protein is likely to be
membrane-associated and is an integral membrane protein.
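
The cited prediction methods are not reproduced here. Purely as an illustration of how hydrophobicity-based membrane-span searches work, the Python sketch below uses a different and simpler technique, a sliding-window average over the Kyte-Doolittle hydropathy scale. The 19-residue window and the 1.6 threshold are conventional choices assumed for this sketch, and the test sequence is hypothetical.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydrophobic_segments(protein, window=19, threshold=1.6):
    """Return merged 0-based inclusive (start, end) spans covered by windows
    whose mean hydropathy is at least `threshold`."""
    segments = []
    for start in range(len(protein) - window + 1):
        score = sum(KD[aa] for aa in protein[start:start + window]) / window
        if score >= threshold:
            end = start + window - 1
            if segments and start <= segments[-1][1] + 1:
                segments[-1] = (segments[-1][0], end)  # extend current span
            else:
                segments.append((start, end))
    return segments


# Hypothetical protein: a 20-leucine core flanked by charged residues.
toy = "DEKR" * 3 + "L" * 20 + "DEKR" * 3
print(hydrophobic_segments(toy))
```

Each reported span is a candidate membrane-spanning region; counting the spans gives the number of candidate transmembrane domains under this simplified scheme.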

A homology search was conducted to identify proteins homologous to the PG1 protein.
Several proteins were identified which share homology with the PG1 protein. Figure 9 lists the
accession numbers of several proteins which share homology with the PG1 protein in three regions
designated box1, box2 and box3.

It will be appreciated that each of the motifs described above is also present in the protein of
SEQ ID NO: 5, which would be produced if translation initiated from the potential
upstream methionine in the nucleic acid of SEQ ID NO: 1.

As indicated in Figure 9, a distinctive pattern of homology to box 1, box 2 (SEQ ID NOs: 11-14)
and box 3 (SEQ ID NOs: 15-20) is found amongst acyl glycerol transferases. For example, the
plsC protein from E. coli (Accession Number P26647) shares homology with the box1 and box2
sequences, but not the box 3 sequence, of the PG1 protein. The product of this gene transfers acyl
from acyl-coenzyme A to the sn2 position of 1-acyl-sn-glycerol-3-phosphate (lysophosphatidic acid,
LPA) (Coleman J., Mol Gen Genet. 1992 Mar 1; 232(2): 295-303).

Box1 and box2 homologies, but not box 3 homologies, are also found in the SLC1 gene
product from baker's yeast (Accession Number P33333) and the mouse gene AB005623. Each of
these genes is able to complement in vivo mutations in the bacterial plsC gene (Nagiec MM, Wells
GB, Lester RL, Dickson RC, J. Biol. Chem., 1993 Oct 15; 268(29): 22156-22163, A suppressor gene
that enables Saccharomyces cerevisiae to grow without making sphingolipids encodes a protein that
resembles an Escherichia coli fatty acyltransferase; and Kume K, Shimizu T, Biochem. Biophys. Res.
Commun. 1997, Aug. 28; 237(3): 663-666, cDNA cloning and expression of murine
1-acyl-sn-glycerol-3-phosphate acyltransferase).

Recently two different human homologues of the mouse AB005623 gene, Accession Numbers
U89336 and U56417, were cloned and found to be localized to human chromosomes 6 and 9
(Eberhardt, C., Gray, P.W. and Tjoelker, L.W., J. Biol. Chem. 1997; 272, 20299-20305, Human
lysophosphatidic acid acyltransferase cDNA cloning, expression, and localization to chromosome
9q34.3; and West, J., Tompkins, C.K., Balantac, N., Nudelman, E., Meengs, B., White, T., Bursten,
S., Coleman, J., Kumar, A., Singer, J.W. and Leung, D.W., DNA Cell Biol. 6, 691-701 (1997),
Cloning and expression of two human lysophosphatidic acid acyltransferase cDNAs that enhance
cytokine induced signaling responses in cells).

The enzymatic acylation of LPA results in 1,2-diacyl-sn-glycerol 3-phosphate, an intermediate
in the biosynthesis of both glycerophospholipids and triacylglycerol. Several important signaling
messengers participating in the transduction of mitogenic signals, induction of apoptosis, transmission
of nerve impulses and other cellular responses mediated by membrane bound receptors belong to this
metabolic pathway.

LPA itself is a potent regulator of mammalian cell proliferation. In fact, LPA is one of the
major mitogens found in blood serum (for a review: Durieux ME, Lynch KR, Trends Pharmacol.
Sci. 1993 Jun; 14(6):249-254, Signaling properties of lysophosphatidic acid). LPA can act as a
survival factor to inhibit apoptosis of primary cells (Levine JS, Koh JS, Triaca V, Lieberthal W,
Am. J. Physiol. 1997 Oct; 273(4Pt2): F575-F585, Lysophosphatidic acid: a novel growth and survival
factor for renal proximal tubular cells). This function of LPA is mediated by the lipid kinase
phosphatidylinositol 3-kinase.

Phosphatidylinositol and its derivatives present another class of messengers emerging from
the 1-acyl-sn-glycerol-3-phosphate acyltransferase pathway (Toker A, Cantley LC, Nature 1997 Jun
12; 387(6634): 673-676, Signaling through the lipid products of phosphoinositide-3-OH kinase;
Martin TF, Curr. Opin. Neurobiol. 1997 Jun; 7(3):331-338, Phosphoinositides as spatial regulators of
membrane traffic; and Hsuan JJ, et al., Int. J. Biochem. Cell Biol. 1997 Mar; 29(3): 415-435,
Growth factor-dependent phosphoinositide signaling).

Cell growth, differentiation and apoptosis can be affected and modified by enzymes involved
in this metabolic pathway. Consequently, alteration of this pathway could facilitate cancer cell
progression. Modulation of the activity of enzymes in this pathway using agents such as enzymatic
inhibitors could be a way to restore a normal phenotype to cancerous cells.

Ashagbley A, Samadder P, Bittman R, Erukulla RK, Byun HS and Arthur G have recently shown
that an ether-linked analogue of lysophosphatidic acid, 4-O-hexadecyl-3(S)-O-methoxybutanephosphonate,
can effectively inhibit the proliferation of several human cancerous cell
lines, including the DU145 line of prostate cancer origin (Anticancer Res 1996 Jul; 16(4A): 1813-1818,
Synthesis of ether-linked analogues of lysophosphatidate and their effect on the proliferation of
human epithelial cancer cells in vitro).

Structural differences between the PG1 family of cellular proteins and the functionally
confirmed 1-acyl-sn-glycerol-3-phosphate acyltransferase family, evidenced by the existence of a
different pattern of homology to box3, could point to unique substrate specificity in the phospholipid
metabolic pathway, to specific interaction with other cellular components or to both.

Further analysis of the function of the PG1 gene can be conducted, for example, by
constructing knockout mutations in the yeast homologues of the PG1 gene in order to elucidate the
potential function of this protein family, and by testing potential substrate analogs in order to revert the
malignant phenotype of human prostate cancer cells as described in Section VIII, below.

Example 7 

Analysis of the Intracellular Localisation of the PG1 Isoforms

To study the intracellular localisation of PG1 protein, different isoforms of PG1 were cloned
in the expression vector pEGFP-N1 (Clontech), transfected and expressed in normal (PNT2A) or
adenocarcinoma (PC3) prostatic cell lines.

First, to generate cDNA inserts, 5' and 3' primers were synthesised to amplify
different regions of the PG1 open reading frame. These primers were designed with an
internal EcoRI or BamHI site, respectively, which allowed the insertion of the amplified product into the
EcoRI and BamHI sites of the expression vector. The restriction sites were introduced into the primers so that
after cloning into pEGFP-N1, the PG1 open reading frame would be fused in frame to the EGFP open
reading frame. The translated protein would be a fusion between PG1 and EGFP. EGFP being a
variant form of the GFP protein (Green Fluorescent Protein), it is possible to detect the intracellular
localisation of the different PG1 isoforms by examining the fluorescence emitted by the EGFP fused
protein.

The different forms that were analysed correspond either to different messengers identified by
RT-PCR performed on total normal human prostatic RNA or to a truncated form resulting from a
nonsense mutation identified in the tumoural prostatic cell line LNCaP. The different PG1 constructs
were transfected using the lipofectin technique and EGFP expression was examined 20 hours
post-transfection.

The names and descriptions of the different forms transfected are listed below:

A) PG1 includes all the coding exons from exon 1 to 8.

B) PG1/1-4 corresponds to an alternative messenger which is due to an alternative splicing,
joining exon 1 to exon 4, and resulting in the absence of exons 2 and 3.

C) PG1/1-5 corresponds to an alternative messenger which is due to an alternative splicing,
joining exon 1 to exon 5, and resulting in the absence of exons 2, 3 and 4.

D) PG1/1-7 includes exons 1 to 6, and corresponds to the mutated form identified in genomic
DNA of the prostatic tumoural cell line LNCaP.
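
The four forms listed above differ only in which coding exons they retain, which can be captured in a small lookup table. The Python sketch below is illustrative; it assumes that the alternatively spliced forms retain every exon downstream of the splice junction, which the list implies but does not state outright.

```python
ALL_EXONS = list(range(1, 9))  # coding exons 1 to 8

# Exon content of each transfected form, as read from the list above.
ISOFORMS = {
    "PG1": list(range(1, 9)),            # exons 1-8
    "PG1/1-4": [1] + list(range(4, 9)),  # exon 1 joined to exon 4
    "PG1/1-5": [1] + list(range(5, 9)),  # exon 1 joined to exon 5
    "PG1/1-7": list(range(1, 7)),        # truncated form, exons 1-6
}


def missing_exons(name):
    """Return the coding exons absent from the named form."""
    return [exon for exon in ALL_EXONS if exon not in ISOFORMS[name]]


for name in ISOFORMS:
    print(name, missing_exons(name))
```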

Cloning of the PG1 cDNA inserts in the EGFP-N1 expression vector

cDNAs from human prostate were obtained by RT-PCR using the Advantage RT-for-PCR Kit
(CLONTECH ref K1402-2). First, 1 µl of oligo-dT-containing PG1 specific primer PGRT32
TTTirnriTITrmTTTGAAAT (20 pmoles) and 11.5 µl of DEPC-treated H2O were added to
total mRNA (1 µg) extracted from human prostate (CLONTECH ref 64038-1). The mRNA was
heat denatured for 2.5 min at 74°C and then quickly chilled on ice. A mix containing 4 µl of 5X
buffer, 1 µl of dNTPs (10 mM each), 0.5 µl of recombinant RNase inhibitor (20 U) and 1 µl of MoMuLV
Reverse Transcriptase (200 U) was added to the denatured mRNA. Reverse transcription was
performed for 60 min at 42°C. Enzymes were heat denatured for 5 min at 94°C. Then, 80 µl of
DEPC-treated H2O were added to the reaction mix and the cDNA mix was stored at -20°C. Primers
PG15Eco3 (5' CCTGAATTCCGCCGAGCTGAGAAGATGC 3') and PG13Bam2 (5'
CCTGGATCCGCTTTAATAGTAACCCACAGGCAG 3') were used for PCR amplification of the
different PG1 cDNAs. A 50 µl PCR reaction mix containing 5 µl of the previously prepared prostate
cDNA mix, 15 µl of 3.3X PCR buffer, 4 µl of dNTPs (2.5 mM each), 20 pmoles of primer PG15Eco3,
20 pmoles of primer PG13Bam2, 1 µl of rTth XL enzyme and 2.2 µl of Mg(OAc)2 (Hot Start) was set up, and
amplification was performed for 35 cycles of 30 sec at 94°C, 10 min at 72°C and 4 min at 67°C after an
initial denaturation step of 10 min at 94°C. Size and integrity of the PCR product were assessed by
migration on a 1% agarose gel. 2 µg of the amplification product were digested with 2.4 units of
EcoRI (PROMEGA ref R601A) and 2.0 units of BamHI (PROMEGA ref R602A) in 50 µl of 1X
Multicore buffer for 2 hours at 37°C. Enzymes were then heat inactivated for 20 min at 68°C, DNA
was phenol/chloroform extracted and ethanol-precipitated, and its concentration was estimated by
migration on a 1% agarose gel.
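
Before ordering primers of this kind, it is easy to confirm that each one carries the intended restriction site. The following Python sketch scans a primer for the EcoRI (GAATTC) and BamHI (GGATCC) recognition sequences; the function name is hypothetical, and the example uses the 5' primer PG15Eco3 as given in the text.

```python
RECOGNITION_SITES = {
    "EcoRI": "GAATTC",
    "BamHI": "GGATCC",
}


def find_sites(primer):
    """Return, for each enzyme, the 0-based positions at which its
    recognition sequence starts within `primer`."""
    primer = primer.upper()
    found = {}
    for enzyme, site in RECOGNITION_SITES.items():
        positions = []
        start = primer.find(site)
        while start != -1:
            positions.append(start)
            start = primer.find(site, start + 1)
        found[enzyme] = positions
    return found


# 5' primer PG15Eco3: the EcoRI site follows a three-base 5' clamp.
print(find_sites("CCTGAATTCCGCCGAGCTGAGAAGATGC"))
```

The same check applied to an amplified insert would reveal internal EcoRI or BamHI sites, which would cause the digestion step to cut the insert itself.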

To prepare the vector, 2 µg of pEGFP-N1 vector (CLONTECH ref 6085-1) were digested with
2.4 units of EcoRI (PROMEGA ref R601A) and 2.0 units of BamHI (PROMEGA ref R602A) in 50 µl
of 1X Multicore buffer for 2 hours at 37°C. Enzymes were then heat inactivated for 20 min at 68°C,
DNA was phenol/chloroform extracted and ethanol-precipitated, and its concentration and integrity
were estimated by migration on a 1% agarose gel. 20 ng of the BamHI and EcoRI digested pEGFP-N1
vector were added to 50 ng of BamHI-EcoRI digested PG1 cDNAs. Ligation was performed overnight
using 0.5 units of T4 DNA ligase (BOEHRINGER ref 84333623) in a final volume of 20 µl
containing 1X ligase buffer. The ligation reaction mix was desalted by dialysis against water
(MILLIPORE ref VSWP01300) for 30 min at room temperature. One fifth of the desalted ligation
reaction was electroporated in 25 µl of competent cells ElectroMAX DH10B (GIBCO BRL ref
18290-015) using a resistance of 126 Ohms, capacitance of 50nF, and voltage of 2.5 kV. Bacteria
were then incubated in 500 µl of SOB medium for 30 min at 37°C. One fifth was plated on LB agar
containing 40 µg/ml kanamycin (SIGMA ref K4000) and incubated overnight at 37°C.
Plasmid DNA was prepared from an overnight liquid culture of individual colonies and sequenced.
Among the different forms identified, 3 were used:

A) PG1, which includes all the coding exons from exon 1 to 8. 

B) PG1/1-4, which corresponds to an alternative messenger resulting from an alternative 
splicing joining exon 1 to exon 4, and therefore lacking exons 2 and 3. 

C) PG1/1-5, which corresponds to an alternative messenger resulting from an alternative 
splicing joining exon 1 to exon 5, and therefore lacking exons 2, 3 and 4. 

D) Vector PG1/1-7: A cDNA insert encoding a truncated protein was synthesized by PCR 
amplification, using primers PG15Eco3 and PG1mut29Bam (5' 
CC TGGATCC CCTCCATCGTCnTCCCTT 3') and vector PG1 as a template. The resulting PCR 
product was cloned following the same protocol as described above. 
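
By way of illustration only, the directional cloning above relies on the EcoRI and BamHI sites carried by the primers. The following minimal Python sketch checks a primer for those recognition sequences (GAATTC and GGATCC). The helper name `find_sites` is hypothetical, and the primer is entered exactly as printed above, with the unreadable character kept as an ambiguous base N; that choice is an assumption and not a statement about the actual primer sequence.

```python
# Hypothetical helper: locate restriction sites in a primer (illustrative only).
SITES = {"EcoRI": "GAATTC", "BamHI": "GGATCC"}


def find_sites(primer: str) -> dict:
    """Return {enzyme: [0-based start positions]} for each site found in the primer."""
    # Remove the spacing used in the printed sequence and normalise case.
    seq = "".join(primer.split()).upper()
    hits = {}
    for enzyme, site in SITES.items():
        positions = [
            i for i in range(len(seq) - len(site) + 1)
            if seq[i:i + len(site)] == site
        ]
        if positions:
            hits[enzyme] = positions
    return hits


# PG1mut29Bam as printed; "N" stands for the unreadable character (assumption).
pg1mut29bam = "CC TGGATCC CCTCCATCGTCNTCCCTT"
print(find_sites(pg1mut29bam))
```

On the sequence as entered, the only site reported is a BamHI site starting at the fourth nucleotide, which agrees with the "Bam" suffix of the primer name.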

Transfection of the PG1 expression vectors in human prostate cell lines. 
The DNA/lipofectin solution was prepared as follows: 1.5µl of lipofectin (GIBCO BRL ref 
18292-011) was diluted in 100µl of OPTI-MEM medium (GIBCO BRL ref 31985-018) and 
incubated for 30min. at room temperature before being mixed with 0.5µg of vector diluted in 100µl of 
OPTI-MEM medium and incubated for 15 min. at room temperature. Cells were inoculated in 
RPMI1640 medium (Gibco BRL ref 61870-010) containing 5% fetal calf serum (Dutscher ref 
P30-3302) on slides (NUNC Lab-Tek ref 177402A) and grown at 37°C in 5% CO2. Cells reaching 40-60% 
confluency were rinsed with 300µl OPTI-MEM medium and incubated with the DNA/lipofectin 
solution for 6 hours at 37°C. The medium containing DNA was replaced by medium supplemented with 
fetal calf serum and cells were incubated for at least 36 hours at 37°C. Slides were rinsed in PBS and 
cells were fixed in ethanol, treated with propidium iodide, and examined with a fluorescence 
microscope using a double-pass filter set for FITC/PI. 

After transfection of PG1 and PG1/1-4 in both the normal and the tumoral prostatic cell line, 
green fluorescence was detected in and around the nucleus (Figures 10 and 11). This result shows 
that the PG1 protein is localised in the nucleus and/or the nuclear membrane. Furthermore, it suggests 
that exons 2 and 3 are dispensable for translocation of PG1 to the nucleus. In addition, no difference 
in the intracellular localisation of these two forms was detected between the tumoral and the normal 
prostatic cell line. 

On the contrary, transfection experiments using PG1/1-5 show that this form is cytoplasmic in 
the normal prostatic cell line PNT2A. This suggests that exon 4 might be important for the regulation of 
the translocation to the nucleus. Interestingly, similar transfection experiments in the tumoral cell line 
PC3 show that PG1/1-5 remains nuclear and/or perinuclear (Figure 12). This result shows that there is 
an abnormality in the regulation of the intracellular localization of the PG1 isoforms in this tumoral 
cell line. Furthermore, it indicates that the normal function of PG1 can be altered indirectly in 
prostatic tumors by an abnormality in the regulation of its intracellular location. 

Finally, a nonsense mutation has been identified in the prostatic tumoral cell line LNCaP, in 
exon 6 of PG1 (SEQ ID NO: 69). This mutation is responsible for the production of a truncated 
protein (SEQ ID NO: 70). To determine the intracellular location of this truncated protein, PG1/1-7 
and PG1 were transfected in the normal prostatic cell line PNT2A. Comparison of the fluorescence 
detected in both sets of experiments clearly showed that the truncated form was localised in the 
cytoplasm, whereas the non-truncated protein was located in and/or around the nucleus (Figure 13). This 
result indicates that this mutated PG1 is translated into a truncated protein which is unable to reach the 
nucleus. It also suggests that exons 7 and 8 may play an important role in the regulation of the 
intracellular localisation of PG1. Furthermore, it supports the previous hypothesis that an altered 
regulation of PG1 intracellular localisation might be involved in prostate tumorigenesis. 

Alternative Splice Species 

Alternative splicing is a common natural tool for the inhibition of function of full length gene 
products. Alternative splicing is known to result in enzyme isoforms possessing different kinetic 
characteristics (pyruvate kinase: M1 and M2; Yamada K, Noguchi T, Biochem J. 1999 Jan 1;337(Pt 
1):1-11). The estrogen receptor (ER) gene is known to possess variant splicing yielding the deletions of 
exon 3, 5, or 7. The truncated ER protein induced from variant mRNA could mainly act as a 
repressor through dominant negative effects on normal ER protein (Iwase H, Omoto Y, Iwata H, Hara 
Y, Ando Y, Kobayashi S, Oncology 1998 Dec;55 Suppl S1:11-16). Yu et al. (Yu JJ, Mu C, Dabholkar 
M, Guo Y, Bostick-Bruton F, Reed E, Int J Mol Med 1998 Mar;1(3):617-620) demonstrated that there 
is an association between alternative splicing of ERCC1 and a reduction in the cellular capability to repair 
cisplatin-DNA adducts. Munoz-Sanjuan et al. (Munoz-Sanjuan I, Simandl BK, Fallon JF, Nathans J, 
Development 1998 Dec 14;126(Pt 2):409-421) demonstrated the existence of two differentially spliced 
isoforms of fibroblast growth factor (FGF) type two genes that are present in non-overlapping spatial 
distributions in the neural tube and adjacent structures in the developing chicken embryo. One of these 
forms is secreted and activates the expression of HoxD13, HoxD11, Fgf-4 and BMP-2 ectopically, 
consistent with cFHF-2 playing a role in anterior-posterior patterning of the limb. 

CD44 is a cell adhesion molecule that is present as numerous isoforms created by mRNA 
alternative splicing. Expression of variant isoforms of CD44 is associated with tumor growth and 
metastasis. Shibuya et al. (Shibuya Y, Okabayashi T, Oda K, Tanaka N, Jpn J Clin Oncol 1998 Oct;28(10):609-14) 
showed that the ratio of two particular isoforms is a useful indicator of prognosis in gastric and 
colorectal carcinoma. Zhang YF et al. (Zhang YF, Jeffery S, Burchill SA, Berry PA, Kaski JC, Carter 
ND, Br J Cancer 1998 Nov;78(9):1141-6) showed that human endothelin receptor A is subject to 
alternative splicing giving at least two isoforms. The truncated receptor was expressed in all tissues 
and cells examined, but the level of expression varied. In melanoma cell lines and melanoma tissues, 
the truncated receptor gene was the major species, whereas the wild-type ETA was predominant in 
other tissues. Zhang et al. conclude that the function and biological significance of this truncated 
ETA receptor is not clear, but it may have regulatory roles for cell responses to ETs. 

Example 8 

Identification of PG1 Alternative Splice Species 
The PG1 cDNA was first cloned by screening of a human prostate cDNA library. Sequence 
analysis of about 400 cDNA clones showed that at least 14 isoforms were present in this cDNA 
library. Comparison of their sequences to the genomic sequence showed that these isoforms resulted 
from a complex set of different alternative splicing events between numerous exons (Figure 14). 

To rule out the possibility of a cloning artefact generated during the cDNA library 
construction, and to systematically identify all existing alternative splice junctions, RT-PCR 
experiments were performed on RNA of normal prostate as well as normal prostatic cell lines 
PNT1A, PNT1B and PNT2 using all the possible combinations of primers specific to the different 
exon borders (SEQ ID NOs: 137-178). The presence of multiple PCR bands in each reaction was 
assessed by migration in an agarose gel. Each band was analysed by sequencing, and the presence or 
absence of specific splicing events, as seen in the sequence by a specific splice junction, was scored 
as plus or minus in Figure 15. 

Furthermore, to identify aberrant splicing events in prostate tumors, similar experiments were 
performed on RNA extracted from tumoral prostatic cell lines LNCaP (obtained from two different 
sources and named FCG and JMB), CaHPV, Du145 and PC3 as well as on RNA obtained from 
prostate tumors (ECP5 to ECP24). 

As shown in the first five columns, all isoforms identified in the cDNA library were detected 
in RNA of normal prostate, normal prostatic cell lines or prostate tumors. In addition to the different 
splice junctions detected in the cDNA library, 19 other splice junctions were detected in normal 
prostate or in normal prostatic cell lines. Two types of exon junctions (exons 3-7, exons 3b-8) were 
never detected in either normal prostate, normal prostatic cell lines, prostate tumors or prostatic 
tumoral cell lines. Comparison between normal and tumoral samples showed the presence of 2 
additional exon junctions (exons 3-8, exons 5-8) in the tumoral samples that were not detected 
previously in the normal samples. This result demonstrates that during tumorigenesis, the complex 
regulation of the PG1 splicing has been altered, resulting in an abnormal ratio of the different 
isoforms. This is of specific interest since it has been shown in patients with a genetic predisposition to 
Wilms tumor that an imbalance between different RNA isoforms might be involved in tumorigenesis 
(Bickmore et al., Science 1992, 257:325-7; Little et al., Hum Mol Genet 1995, 4:351-8). 

Interestingly, comparison between normal and tumoral samples also showed that some exon 
junctions are present in all normal samples, but are absent in numerous tumoral samples. This further 
indicates that the normal function of PG1 can be altered by an abnormality in the regulation of PG1 
splicing and further supports the previous hypothesis. 
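
The plus/minus scoring of splice junctions lends itself to a simple tabulation. The sketch below is offered as an illustration only: the function name `classify_junctions`, the junction labels J1 and J2, and above all the sample assignments are invented for the example and are not the data of Figure 15. It shows how junctions might be sorted into the classes discussed above (never detected, detected only in tumoral samples, present in all normal samples but missing from some tumoral samples).

```python
def classify_junctions(calls, normal_samples, tumoral_samples):
    """Classify each junction from the set of samples in which it was scored 'plus'."""
    normal, tumoral = set(normal_samples), set(tumoral_samples)
    result = {}
    for junction, detected in calls.items():
        detected = set(detected)
        in_normal = detected & normal
        in_tumoral = detected & tumoral
        if not in_normal and not in_tumoral:
            label = "never detected"
        elif in_tumoral and not in_normal:
            label = "tumoral only"
        elif in_normal == normal and in_tumoral != tumoral:
            label = "absent in some tumoral samples"
        else:
            label = "other"
        result[junction] = label
    return result


# Toy input: sample names follow the text, but the calls are invented.
normal = ["normal prostate", "PNT1A", "PNT1B", "PNT2"]
tumoral = ["LNCaP", "Du145", "PC3"]
calls = {
    "3-8": {"PC3"},
    "5-8": {"LNCaP", "Du145"},
    "3-7": set(),
    "J1": {"normal prostate", "PNT1A", "PNT1B", "PNT2", "PC3"},  # hypothetical junction
    "J2": set(normal) | set(tumoral),                            # hypothetical junction
}
for junction, label in classify_junctions(calls, normal, tumoral).items():
    print(junction, "->", label)
```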

Furthermore, comparison between the different types of normal samples (Col. 2 versus Col. 3, 
4 and 5) also showed differences in the presence or absence of some exon junctions. This indicates that 
the transformation process necessary for the generation of these normal prostatic cell lines might result 
in similar alterations, which further supports the previous hypothesis. 

Example 9 

Determining the Tumor Suppressor Activity of the PG1 Gene Product, Mutants and Other PG1 
Polypeptides 

PG1 variants which result from either alternate splicing of the PG1 mRNA or from mutation 
of PG1 that introduces a stop codon (nucleotide of SEQ ID NO: 69 and protein of SEQ ID NO: 70) can 
no longer perform their role of tumor suppressor. It is possible and even likely that the PG1 tumor 
suppressor role extends beyond prostate cancer to other forms of malignancy. PG1 therefore 
represents a prime candidate for gene therapy of cancer by creating a targeting vector which knocks out 
the mutant and/or introduces a wild-type PG1 gene (e.g. SEQ ID NO 3 or 179) or a fragment thereof. 

To validate this model, PG1 and its alternatively spliced or mutated variants are stably 
transfected in tumor cell lines using methods described in Section VIII. The efficiency of transfection 
is determined by northern and western blotting; the latter is performed using antibodies prepared 
against PG1 synthetic peptides designed to distinguish the product of the most abundant PG1 mRNA 
from the alternatively spliced variants, the truncated variant, or other functional mutants. The 
production of synthetic peptides and of polyclonal antibodies is performed using the methods 
described herein in Sections III and VII. After demonstrating that PG1 and its variants are efficiently 
expressed in various tumor cell lines, preferably derived from human prostate cancer, hepatocarcinoma, 
lung and colon carcinoma, the effects of this gene on the rate of cell division, DNA synthesis, 
ability to grow in soft agar and ability to induce tumor progression and metastasis when injected in 
immunologically deficient nude mice are determined. 

Alternatively, the PG1 gene and its variants are inserted in adenoviruses that are used to obtain 
a high level of expression of these genes. This method is preferred to test the effect of PG1 
expression in animals that are spontaneously developing tumors. The production of specific 
adenoviruses is obtained using methods familiar to those with normal skills in cell and molecular 
biology. 

II. POLYNUCLEOTIDES: 

The present invention encompasses polynucleotides in the form of PG1 genomic or cDNA as 
well as polynucleotides for use as primers and probes in the methods of the invention. These 
polynucleotides may consist of, consist essentially of, or comprise a contiguous span of nucleotides of 
a sequence from any sequence in the Sequence Listing as well as sequences which are complementary 
thereto ("complements thereof"). Preferably said sequence is selected from SEQ ID NOs: 3, 112-125, 
179, 182-184. The "contiguous span" is at least 6, 8, 10, 12, 15, 20, 25, 30, 50, 100, 200, or 500 
nucleotides in length. It should be noted that the polynucleotides of the present invention are not 
limited to having the exact flanking sequences surrounding the polymorphic bases which are 
enumerated in the Sequence Listing. Rather, it will be appreciated that the flanking sequences 
surrounding the biallelic markers, or any of the primers or probes of the invention which are more 
distant from a biallelic marker, may be lengthened or shortened to any extent compatible with their 
intended use and the present invention specifically contemplates such sequences. It will be 
appreciated that the polynucleotides referred to in the Sequence Listing may be of any length compatible 
with their intended use. Also, the flanking regions outside of the contiguous span need not be 
homologous to native flanking sequences which actually occur in humans. The addition of any 
nucleotide sequence which is compatible with the nucleotides' intended use is specifically 
contemplated. The contiguous span may optionally include the PG1-related biallelic marker in said 
sequence. Optionally either allele of the biallelic markers described above in the definition of 
PG1-related biallelic marker is specified as being present at the PG1-related biallelic marker. 

The invention also relates to polynucleotides that hybridize, under conditions of high or 
intermediate stringency, to a polynucleotide of a sequence from any sequence in the Sequence Listing 
as well as sequences which are complementary thereto. Preferably said sequence is selected from 
SEQ ID NOs: 3, 112-125, 179, 182-184. Preferably such polynucleotides are at least 6, 8, 10, 12, 15, 
20, 25, 30, 35, 40, 50, 60, 70, 80, 90, 100, 200, or 500 nucleotides in length. Preferred polynucleotides 
comprise a PG1-related biallelic marker. Optionally either allele of the biallelic markers described 
above in the definition of PG1-related biallelic marker is specified as being present at the biallelic 
marker site. Conditions of high and intermediate stringency are further described in Section X.C.4, 
below. 

The invention embodies polynucleotides which encode an entire human, mouse or 
mammalian PG1 protein, or fragments thereof. Generally the polynucleotides of the invention 
comprise the naturally occurring nucleotide sequence of PG1. However, any naturally occurring 
silent codon variation or other silent codon variation can be employed to encode the PG1 amino acid 
sequence. As for those amino acids which are changed or added to the PG1 gene for any embodiment 
of the invention which requires the expression of a nucleotide sequence, the nucleic acid sequences 
generally will be chosen to optimize expression in the specific human or non-human animal system in 
which the polynucleotide is intended to be used, making use of known codon preferences. The PG1 
polynucleotides of the invention can be the native nucleotide sequence which encodes a human, 
mouse, or mammalian PG1 protein, preferably the PG1 polynucleotide sequence of SEQ ID NOs: 3, 
112-125, 179, 182-184, and the complements thereof. The polynucleotides of the invention include 
those which encode PG1 polypeptides with a contiguous stretch of at least 8, 10, 12, 15, 20, 25, 30, 
50, 100 or 200 amino acids from SEQ ID NOs: 4, 5, 70, 74, and 125-136, as well as any other human, 
mouse or mammalian PG1 polypeptide. In addition the present invention encompasses 
polynucleotides which comprise a contiguous stretch of at least 8, 10, 12, 15, 20, 25, 30, 50, 100, 200, or 
500 nucleotides of a human, mouse or mammalian PG1 genomic sequence as well as complete human, 
mouse, or mammalian PG1 genes, preferably of SEQ ID NOs: 179, 182, 183, and the complements 
thereof. 

The present invention encompasses polynucleotides which consist of, consist essentially of, 
or comprise a contiguous stretch of at least 8, 10, 12, 15, 20, 25, 30, 50, 100, 200, or 500 nucleotides 
of a human, mouse or mammalian PG1 cDNA sequence as well as an entire human, mouse, or 
mammalian PG1 cDNA. The cDNA species and polynucleotide fragments comprised by the 
polynucleotides of the invention include the predominant species derived from any human, mouse or 
mammal source, preferably SEQ ID NOs: 3, 184, and the complements thereof. In addition, the 
polynucleotides of the invention comprise cDNA species, and fragments thereof, that result from the 
alternative splicing of PG1 transcripts in any human, mouse or other mammal, preferably the cDNA 
species of SEQ ID NOs: 112-124, and complements thereof. Moreover, the invention encompasses 
cDNA species and other polynucleotides which consist of or comprise the polynucleotides which span 
a splice junction, preferably including any one of SEQ ID NOs: 137 to 178, and the complements 
thereof; more preferably any one of SEQ ID NOs: 137 to 149, 151 to 169, 171 to 178, and the 
complements thereof. The polynucleotides of the invention also include cDNA and other 
polynucleotides which comprise two covalently linked PG1 exons, derived from a single human, 
mouse or mammalian species, immediately adjacent to one another in the order shown, and selected 
from the following pairs of PG1 exons: 1:2, 1:3, 1:4, 1:5, 1:6, 1:7, 1:8, 2:3, 2:4, 2:5, 2:6, 2:7, 2:8, 3:4, 
3:5, 3:6, 3:7, 3:8, 4:5, 4:6, 4:7, 4:8, 5:6, 5:7, 5:8, 6:7, 6:8, 7:8, 1:1bis, 1bis:2, 1bis:3, 1bis:4, 1bis:5, 
1bis:6, 1bis:7, 1bis:8, 3:3bis, 3bis:4, 3bis:5, 3bis:6, 3bis:7, 3bis:8, 5:5bis, 5bis:6, 5bis:7, 5bis:8, 
1:6bis, 2:6bis, 3:6bis, 4:6bis, 5:6bis, 6bis:7, 6bis:8, and the complements thereof. In a preferred 
embodiment the sequences of the PG1 exons in each of the pairs of exons are selected as follows: 
exon 1 - SEQ ID NO: 100; exon 2 - SEQ ID NO: 101; exon 3 - SEQ ID NO: 102; 
exon 4 - SEQ ID NO: 103; exon 5 - SEQ ID NO: 104; exon 6 - SEQ ID NO: 105; 
exon 7 - SEQ ID NO: 106; exon 8 - SEQ ID NO: 107; exon 1bis - SEQ ID NO: 108; 
exon 3bis - SEQ ID NO: 109; exon 5bis - SEQ ID NO: 110; and 
exon 6bis - SEQ ID NO: 111. Because of the 8 different polyadenylation sites in exon 8, any cDNA 
or polynucleotide of the invention comprising a human cDNA fragment encompassing exon 8 may be 
truncated such that only the first 330 nucleotides, 699 nucleotides, 833 nucleotides, 1826 nucleotides, 
2485 nucleotides, 2805 nucleotides, 4269 nucleotides or 4315 nucleotides of exon 8 shown in SEQ ID 
NO: 107 are present. 
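
As an aid to reading the list of exon pairs, the sketch below enumerates the ordered pairs of the eight main exons and the eight exon 8 truncations. It is a minimal illustration under stated assumptions: the names `MAIN_EXONS`, `POLYA_LENGTHS` and `exon8_variants` are hypothetical, the pairs involving exons 1bis, 3bis, 5bis and 6bis are not generated, and the exon 8 sequence used in the usage line is a placeholder string rather than SEQ ID NO: 107.

```python
from itertools import combinations

# The eight main exons, in transcript order.
MAIN_EXONS = ["1", "2", "3", "4", "5", "6", "7", "8"]

# Every pair "a:b" in which exon a precedes exon b (1:2, 1:3, ..., 7:8).
main_pairs = [f"{a}:{b}" for a, b in combinations(MAIN_EXONS, 2)]

# Lengths of exon 8 retained for each of the 8 polyadenylation sites listed in the text.
POLYA_LENGTHS = [330, 699, 833, 1826, 2485, 2805, 4269, 4315]


def exon8_variants(exon8_seq: str) -> dict:
    """Return {length: truncated exon 8} for every listed length the input can supply."""
    return {n: exon8_seq[:n] for n in POLYA_LENGTHS if n <= len(exon8_seq)}


print(len(main_pairs), main_pairs[:3], main_pairs[-1])
print(sorted(exon8_variants("A" * 1000)))  # placeholder sequence, not SEQ ID NO: 107
```

The 28 pairs produced correspond to the entries 1:2 through 7:8 of the list above; the remaining entries of that list involve the bis exons.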

The primers of the present invention are designed from the disclosed sequences for any method 
known in the art. A preferred set of primers is fashioned such that the 3' end of the contiguous span 
of identity with the sequences of the Sequence Listing is present at the 3' end of the primer. Such a 
configuration allows the 3' end of the primer to hybridize to a selected nucleic acid sequence and 
dramatically increases the efficiency of the primer for amplification or sequencing reactions. Allele 
specific primers are designed such that a biallelic marker is at the 3' end of the contiguous span and the 
contiguous span is present at the 3' end of the primer. Such allele specific primers tend to selectively 
prime an amplification or sequencing reaction so long as they are used with a nucleic acid sample that 
contains one of the two alleles present at a biallelic marker. The 3' end of a primer of the invention may be 
located within or at least 2, 4, 6, 8, 10, 12, 15, 18, 20, 25, 50, 100, 250, 500, or 1000 nucleotides 
upstream of a PG1-related biallelic marker in said sequence or at any other location which is 
appropriate for its intended use in sequencing, amplification or the location of novel sequences or 
markers. 
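
The allele specific primer configuration described above, in which the biallelic marker is the 3'-terminal base, can be sketched as follows. This is an illustration only: the function name `allele_specific_primers`, the primer length and the template sequence are all invented for the example, and no sequence from the Sequence Listing is used.

```python
def allele_specific_primers(seq: str, marker_pos: int, alleles, length: int = 20) -> dict:
    """Build one primer per allele, each ending (3' end) on the marker base.

    marker_pos is the 1-based position of the biallelic marker in seq.
    """
    start = marker_pos - length  # 0-based index of the first primer base
    if start < 0 or marker_pos > len(seq):
        raise ValueError("primer would extend beyond the template sequence")
    upstream = seq[start:marker_pos - 1]  # the length-1 bases preceding the marker
    return {allele: upstream + allele for allele in alleles}


# Invented template; the marker sits at position 12 (a G in this strand).
template = "ACGTACGTACGGATTACAGC"
print(allele_specific_primers(template, marker_pos=12, alleles=("G", "A"), length=8))
```

Each primer differs from the other only in its last base, so that extension from the 3' end is favoured on the template carrying the matching allele.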

Preferred amplification primers include the polynucleotides disclosed in SEQ ID NOs: 39-56, 
and 63-68. Additional preferred amplification primers for particular non-genic PG1-related biallelic 
markers are listed as follows by the internal reference number for the marker and the SEQ ID NOs for 
the PU and RP amplification primers respectively: 

4-14-107 use SEQ ID NOs 339 and 382; 4-14-317 use SEQ ID NOs 339 and 382; 
4-14-35 use SEQ ID NOs 339 and 382; 4-20-149 use SEQ ID NOs 340 and 383; 
4-22-174 use SEQ ID NOs 341 and 384; 4-22-176 use SEQ ID NOs 341 and 384; 
4-26-60 use SEQ ID NOs 342 and 385; 4-26-72 use SEQ ID NOs 342 and 385; 
4-3-130 use SEQ ID NOs 343 and 386; 4-38-63 use SEQ ID NOs 344 and 387; 
4-38-83 use SEQ ID NOs 344 and 387; 4-4-152 use SEQ ID NOs 345 and 388; 
4-4-187 use SEQ ID NOs 345 and 388; 4-4-288 use SEQ ID NOs 345 and 388; 
4-42-304 use SEQ ID NOs 346 and 389; 4-42-401 use SEQ ID NOs 346 and 389; 
4-43-328 use SEQ ID NOs 347 and 390; 4-43-70 use SEQ ID NOs 347 and 390; 
4-50-209 use SEQ ID NOs 348 and 391; 4-50-293 use SEQ ID NOs 348 and 391; 
4-50-323 use SEQ ID NOs 348 and 391; 4-50-329 use SEQ ID NOs 348 and 391; 
4-50-330 use SEQ ID NOs 348 and 391; 4-52-163 use SEQ ID NOs 349 and 392; 
4-52-88 use SEQ ID NOs 349 and 392; 4-53-258 use SEQ ID NOs 350 and 393; 
4-54-283 use SEQ ID NOs 351 and 394; 4-54-388 use SEQ ID NOs 351 and 394; 
4-55-70 use SEQ ID NOs 352 and 395; 4-55-95 use SEQ ID NOs 352 and 395; 
4-56-159 use SEQ ID NOs 353 and 396; 4-56-213 use SEQ ID NOs 353 and 396; 
4-58-289 use SEQ ID NOs 354 and 397; 4-58-318 use SEQ ID NOs 354 and 397; 
4-60-266 use SEQ ID NOs 355 and 398; 4-60-293 use SEQ ID NOs 355 and 398; 
4-84-241 use SEQ ID NOs 356 and 399; 4-84-262 use SEQ ID NOs 356 and 399; 
4-86-206 use SEQ ID NOs 357 and 400; 4-86-309 use SEQ ID NOs 357 and 400; 
4-88-349 use SEQ ID NOs 358 and 401; 4-89-87 use SEQ ID NOs 359 and 402; 
99-123-184 use SEQ ID NOs 360 and 403; 99-128-202 use SEQ ID NOs 361 and 404; 
99-128-275 use SEQ ID NOs 361 and 404; 99-128-313 use SEQ ID NOs 361 and 404; 
99-128-60 use SEQ ID NOs 361 and 404; 99-12907-295 use SEQ ID NOs 362 and 405; 
99-130-58 use SEQ ID NOs 363 and 406; 99-134-362 use SEQ ID NOs 364 and 407; 
99-140-130 use SEQ ID NOs 365 and 408; 99-1462-238 use SEQ ID NOs 366 and 409; 
99-147-181 use SEQ ID NOs 367 and 410; 99-1474-156 use SEQ ID NOs 368 and 411; 
99-1474-359 use SEQ ID NOs 368 and 411; 99-1479-158 use SEQ ID NOs 369 and 412; 
99-1479-379 use SEQ ID NOs 369 and 412; 99-148-129 use SEQ ID NOs 370 and 413; 
99-148-132 use SEQ ID NOs 370 and 413; 99-148-139 use SEQ ID NOs 370 and 413; 
99-148-140 use SEQ ID NOs 370 and 413; 99-148-182 use SEQ ID NOs 370 and 413; 
99-148-366 use SEQ ID NOs 370 and 413; 99-148-76 use SEQ ID NOs 370 and 413; 
99-1480-290 use SEQ ID NOs 371 and 414; 99-1481-285 use SEQ ID NOs 372 and 415; 
99-1484-101 use SEQ ID NOs 373 and 416; 99-1484-328 use SEQ ID NOs 373 and 416; 
99-1485-251 use SEQ ID NOs 374 and 417; 99-1490-381 use SEQ ID NOs 375 and 418; 
99-1493-280 use SEQ ID NOs 376 and 419; 99-151-94 use SEQ ID NOs 377 and 420; 
99-211-291 use SEQ ID NOs 378 and 421; 99-213-37 use SEQ ID NOs 379 and 422; 
99-221-442 use SEQ ID NOs 380 and 423; 99-222-109 use SEQ ID NOs 381 and 424; and the 
complements thereof. 

Primers with their 3' ends located 1 nucleotide upstream or downstream of a PG1-related 
biallelic marker have a special utility in microsequencing assays. Preferred microsequencing primers 
include the polynucleotides from position 1 to position 23 and from position 25 to position 47 of SEQ 
ID NOs: 21-38, as well as the complements thereof. Additional preferred microsequencing 
primers for particular non-genic PG1-related biallelic markers are listed as follows by the internal 
reference number for the marker and the SEQ ID NOs of the two preferred microsequencing primers: 

4-14-107 of SEQ ID NOs 425 and 502*; 4-14-317 of SEQ ID NOs 426 and 503*; 
4-14-35 of SEQ ID NOs 427 and 504*; 4-20-149 of SEQ ID NOs 428* and 505; 
4-20-77 of SEQ ID NOs 429 and 506; 4-22-174 of SEQ ID NOs 430* and 507; 
4-22-176 of SEQ ID NOs 431 and 508; 4-26-60 of SEQ ID NOs 432 and 509*; 
4-26-72 of SEQ ID NOs 433 and 510; 4-3-130 of SEQ ID NOs 434 and 511*; 
4-38-63 of SEQ ID NOs 435 and 512; 4-38-83 of SEQ ID NOs 436 and 513*; 
4-4-152 of SEQ ID NOs 437 and 514; 4-4-187 of SEQ ID NOs 438* and 515; 
4-4-288 of SEQ ID NOs 439 and 516; 4-42-304 of SEQ ID NOs 440 and 517; 
4-42-401 of SEQ ID NOs 441* and 518; 4-43-328 of SEQ ID NOs 442 and 519; 
4-43-70 of SEQ ID NOs 443* and 520; 4-50-209 of SEQ ID NOs 444* and 521; 
4-50-293 of SEQ ID NOs 445* and 522; 4-50-323 of SEQ ID NOs 446* and 523; 
4-50-329 of SEQ ID NOs 447* and 524; 4-50-330 of SEQ ID NOs 448 and 525; 
4-52-163 of SEQ ID NOs 449* and 526; 4-52-88 of SEQ ID NOs 450* and 527; 
4-53-258 of SEQ ID NOs 451 and 528*; 4-54-283 of SEQ ID NOs 452* and 529; 
4-54-388 of SEQ ID NOs 453 and 530; 4-55-70 of SEQ ID NOs 454 and 531*; 
4-55-95 of SEQ ID NOs 455* and 532; 4-56-159 of SEQ ID NOs 456* and 533; 
4-56-213 of SEQ ID NOs 457 and 534; 4-58-289 of SEQ ID NOs 458* and 535; 
4-58-318 of SEQ ID NOs 459* and 536; 4-60-266 of SEQ ID NOs 460* and 537; 
4-60-293 of SEQ ID NOs 461* and 538; 4-84-241 of SEQ ID NOs 462 and 539*; 
4-84-262 of SEQ ID NOs 463 and 540; 4-86-206 of SEQ ID NOs 464 and 541*; 
4-86-309 of SEQ ID NOs 465 and 542; 4-88-349 of SEQ ID NOs 466 and 543; 
4-89-87 of SEQ ID NOs 467* and 544; 99-123-184 of SEQ ID NOs 468 and 545; 
99-128-202 of SEQ ID NOs 469 and 546; 99-128-275 of SEQ ID NOs 470 and 547; 
99-128-313 of SEQ ID NOs 471 and 548; 99-128-60 of SEQ ID NOs 472* and 549; 
99-12907-295 of SEQ ID NOs 473 and 550*; 99-130-58 of SEQ ID NOs 474* and 551*; 
99-134-362 of SEQ ID NOs 475 and 552*; 99-140-130 of SEQ ID NOs 476* and 553*; 
99-1462-238 of SEQ ID NOs 477* and 554; 99-147-181 of SEQ ID NOs 478 and 555*; 
99-1474-156 of SEQ ID NOs 479 and 556*; 99-1474-359 of SEQ ID NOs 480 and 557; 
99-1479-158 of SEQ ID NOs 481* and 558; 99-1479-379 of SEQ ID NOs 482 and 559; 
99-148-129 of SEQ ID NOs 483 and 560; 99-148-132 of SEQ ID NOs 484 and 561; 
99-148-139 of SEQ ID NOs 485 and 562; 99-148-140 of SEQ ID NOs 486 and 563; 
99-148-182 of SEQ ID NOs 487 and 564*; 99-148-366 of SEQ ID NOs 488 and 565; 
99-148-76 of SEQ ID NOs 489 and 566; 99-1480-290 of SEQ ID NOs 490 and 567*; 
99-1481-285 of SEQ ID NOs 491 and 568*; 99-1484-101 of SEQ ID NOs 492 and 569; 
99-1484-328 of SEQ ID NOs 493* and 570; 99-1485-251 of SEQ ID NOs 494 and 571*; 
99-1490-381 of SEQ ID NOs 495* and 572; 99-1493-280 of SEQ ID NOs 496 and 573*; 
99-151-94 of SEQ ID NOs 497 and 574*; 99-211-291 of SEQ ID NOs 498* and 575; 
99-213-37 of SEQ ID NOs 499 and 576; 99-221-442 of SEQ ID NOs 500 and 577; 
99-222-109 of SEQ ID NOs 501* and 578; and complements thereof. 
Additional preferred microsequencing primers for particular genic PG1-related biallelic 
markers include a polynucleotide selected from the group consisting of the nucleotide sequences from 
position N-X to position N-1 of SEQ ID NO: 179, nucleotide sequences from position N+1 to position 
N+X of SEQ ID NO: 179, and the complements thereof, wherein X is equal to 15, 18, 20, 25, 30, or a 
range of 15 to 30, and N is equal to one of the following values: 2159; 2443; 4452; 5733; 8438; 
11843; 1983; 12080; 12221; 12947; 13147; 13194; 13310; 13342; 13367; 13594; 13680; 13902; 
16231; 16388; 17608; 18034; 18290; 18786; 22835; 22872; 25183; 25192; 25614; 26911; 32703; 
34491; 34756; 34934; 5160; 39897; 40598; 40816; 40947; 45783; 47929; 48206; 48207; 49282; 
50037; 50054; 50101; 50220; 50440; 50562; 50653; 50660; 50745; 50885; 51249; 51333; 51435; 
51468; 51515; 51557; 51566; 51632; 51666; 52016; 52096; 52151; 52282; 52348; 52410; 52580; 
52712; 52772; 52860; 53092; 53272; 53389; 53511; 53600; 53665; 53815; 54365; and 54541. 
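
The position arithmetic for the genic microsequencing primers (N-X to N-1 upstream of the marker, N+1 to N+X downstream) can be sketched as follows. This is an illustration under stated assumptions: positions are taken to be 1-based, the function names are hypothetical, the downstream primer is returned as the reverse complement of the N+1 to N+X span so that its 3' end abuts the marker, and the template is an invented sequence rather than SEQ ID NO: 179.

```python
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def microsequencing_primers(seq: str, n: int, x: int):
    """Return (upstream primer, downstream primer) flanking the marker at 1-based position n.

    The upstream primer covers positions n-x .. n-1; the downstream primer is the
    reverse complement of positions n+1 .. n+x.
    """
    if n - x < 1 or n + x > len(seq):
        raise ValueError("flanking span extends beyond the sequence")
    upstream = seq[n - x - 1:n - 1]
    downstream = seq[n:n + x]
    return upstream, reverse_complement(downstream)


# Invented 20-nucleotide template with the marker at position 10.
template = "AACCGGTTACGTAGCTAGGC"
print(microsequencing_primers(template, n=10, x=5))
```

Neither primer contains the marker base itself; in a microsequencing assay that base is the one interrogated by the single-nucleotide extension.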

The probes of the present invention are designed from the disclosed sequences for any method 
known in the art, particularly methods which allow for testing if a particular sequence or marker 
disclosed herein is present. A preferred set of probes is designed for use in the hybridization assays of 
the invention in any manner known in the art such that they selectively bind to one allele of a biallelic 
marker, but not the other under any particular set of assay conditions. Preferred hybridization probes 
may consist of, consist essentially of, or comprise a contiguous span which ranges in length from 8, 
10, 12, 15, 18 or 20 to 25, 35, 40, 50, 60, 70, or 80 nucleotides, or be specified as being 12, 15, 18, 20, 
25, 35, 40, or 50 nucleotides in length and including a PG1-related biallelic marker of said sequence. 
Optionally either of the two alleles specified in the definition of PG1-related biallelic marker is 
specified as being present at the biallelic marker site. Optionally, said biallelic marker is within 6, 5, 
4, 3, 2, or 1 nucleotides of the center of the hybridization probe or at the center of said probe. A 
preferred set of hybridization probes is disclosed in SEQ ID NOs: 21-38, 57-62, 185-338, and the 
complements thereof. Another particularly preferred set of hybridization probes includes the 
polynucleotides from position X to position Y of any one of SEQ ID NOs: 21-38, 57-62, 185-338, or 
the complements thereof, wherein X is equal to 5, 8, 10, 12, 14, 16, 18 or a range of 5 to 18, and Y is 
equal to 30, 32, 34, 36, 38, 40, 43 or a range of 30 to 43; preferably X equals 12 and Y equals 36. 
Additional preferred hybridization probes for particular genic PG1-related biallelic markers include a 
polynucleotide selected from the group consisting of the nucleotide sequences from position N-X to 
position N+Y of SEQ ID NO: 179, and the complements thereof, wherein X is equal to 8, 10, 12, 15, 
20, 25, or a range of 8 to 30, Y is equal to 8, 10, 12, 15, 20, 25, or a range of 8 to 30, and N is equal to 
one of the following values: 2159; 2443; 4452; 5733; 8438; 11843; 1983; 12080; 12221; 12947; 
13147; 13194; 13310; 13342; 13367; 13594; 13680; 13902; 16231; 16388; 17608; 18034; 18290; 
18786; 22835; 22872; 25183; 25192; 25614; 26911; 32703; 34491; 34756; 34934; 5160; 39897; 
40598; 40816; 40947; 45783; 47929; 48206; 48207; 49282; 50037; 50054; 50101; 50220; 50440; 
50562; 50653; 50660; 50745; 50885; 51249; 51333; 51435; 51468; 51515; 51557; 51566; 51632; 
51666; 52016; 52096; 52151; 52282; 52348; 52410; 52580; 52712; 52772; 52860; 53092; 53272; 
53389; 53511; 53600; 53665; 53815; 54365; and 54541; wherein the nucleotide at position N is 
selected from one of the two alleles specified in the definition of PG1-related biallelic marker at the 
biallelic marker site at position N. 
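
The construction of an allele-specific hybridization probe spanning positions N-X to N+Y, with one of the two alleles placed at position N, can be sketched as follows. This is an illustration only: the function name `hybridization_probes` is hypothetical, positions are assumed to be 1-based, and the template and alleles are invented rather than taken from SEQ ID NO: 179.

```python
def hybridization_probes(seq: str, n: int, x: int, y: int, alleles) -> dict:
    """Return {allele: probe} covering 1-based positions n-x .. n+y of seq.

    The base at position n is replaced by each allele in turn, so the probes
    differ only at the biallelic marker site.
    """
    if n - x < 1 or n + y > len(seq):
        raise ValueError("probe extends beyond the sequence")
    left = seq[n - x - 1:n - 1]   # positions n-x .. n-1
    right = seq[n:n + y]          # positions n+1 .. n+y
    return {allele: left + allele + right for allele in alleles}


# Invented template; marker at position 10 with hypothetical alleles C and T.
template = "AACCGGTTACGTAGCTAGGC"
print(hybridization_probes(template, n=10, x=4, y=4, alleles=("C", "T")))
```

Choosing X equal to Y places the marker at the center of the probe, which corresponds to the optional centering of the biallelic marker mentioned above.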

Any of the polynucleotides of the present invention can be labeled, if desired, by
incorporating a label detectable by spectroscopic, photochemical, biochemical, immunochemical, or
chemical means. For example, useful labels include radioactive substances, fluorescent dyes or
biotin. Preferably, polynucleotides are labeled at their 3' and 5' ends. A label can also be used to
capture the primer, so as to facilitate the immobilization of either the primer or a primer extension
product, such as amplified DNA, on a solid support. A capture label is attached to the primers or
probes and can be a specific binding member which forms a binding pair with the solid phase
reagent's specific binding member (e.g. biotin and streptavidin). Therefore, depending upon the type
of label carried by a polynucleotide or a probe, it may be employed to capture or to detect the target DNA.
Further, it will be understood that the polynucleotides, primers or probes provided herein may
themselves serve as the capture label. For example, in the case where a solid phase reagent's binding
member is a nucleic acid sequence, it is selected such that it binds a complementary portion of a
primer or probe to thereby immobilize the primer or probe to the solid phase. In cases where a
polynucleotide probe itself serves as the binding member, those skilled in the art will recognize that
the probe will contain a sequence or "tail" that is not complementary to the target. In the case where a
polynucleotide primer itself serves as the capture label, at least a portion of the primer will be free to
hybridize with a nucleic acid on a solid phase. DNA labeling techniques are well known to the
skilled technician.

Any of the polynucleotides, primers and probes of the present invention can be conveniently
immobilized on a solid support. Solid supports are known to those skilled in the art and include the
walls of wells of a reaction tray, test tubes, polystyrene beads, magnetic beads, nitrocellulose strips,
membranes, microparticles such as latex particles, sheep (or other animal) red blood cells, duracytes®
and others. The solid support is not critical and can be selected by one skilled in the art. Thus, latex
particles, microparticles, magnetic or non-magnetic beads, membranes, plastic tubes, walls of
microtiter wells, glass or silicon chips, sheep (or other suitable animal's) red blood cells and
duracytes are all suitable examples. Suitable methods for immobilizing nucleic acids on solid phases
include ionic, hydrophobic, covalent interactions and the like. A solid support, as used herein, refers
to any material which is insoluble, or can be made insoluble by a subsequent reaction. The solid
support can be chosen for its intrinsic ability to attract and immobilize the capture reagent.
Alternatively, the solid phase can retain an additional receptor which has the ability to attract and
immobilize the capture reagent. The additional receptor can include a charged substance that is
oppositely charged with respect to the capture reagent itself or to a charged substance conjugated to
the capture reagent. As yet another alternative, the receptor molecule can be any specific binding
member which is immobilized upon (attached to) the solid support and which has the ability to
immobilize the capture reagent through a specific binding reaction. The receptor molecule enables
the indirect binding of the capture reagent to a solid support material before the performance of the
assay or during the performance of the assay. The solid phase thus can be a plastic, derivatized
plastic, magnetic or non-magnetic metal, glass or silicon surface of a test tube, microtiter well, sheet,
bead, microparticle, chip, sheep (or other suitable animal's) red blood cells, duracytes® and other
configurations known to those of ordinary skill in the art. The polynucleotides of the invention can be
attached to or immobilized on a solid support individually or in groups of at least 2, 5, 8, 10, 12, 15,
20, or 25 distinct polynucleotides of the invention on a single solid support. In addition,
polynucleotides other than those of the invention may be attached to the same solid support as one or
more polynucleotides of the invention.

Any polynucleotide provided herein may be attached in overlapping areas or at random locations on
the solid support. Alternatively, the polynucleotides of the invention may be attached in an ordered array
wherein each polynucleotide is attached to a distinct region of the solid support which does not
overlap with the attachment site of any other polynucleotide. Preferably, such an ordered array of
polynucleotides is designed to be "addressable" where the distinct locations are recorded and can be
accessed as part of an assay procedure. Addressable polynucleotide arrays typically comprise a
plurality of different oligonucleotide probes that are coupled to a surface of a substrate in different
known locations. The knowledge of the precise location of each polynucleotide makes these
"addressable" arrays particularly useful in hybridization assays. Any addressable array technology
known in the art can be employed with the polynucleotides of the invention. One particular
embodiment of these polynucleotide arrays is known as the Genechips™, and has been generally
described in US Patent 5,143,854; PCT publications WO 90/15070 and 92/10092. These arrays may
generally be produced using mechanical synthesis methods or light directed synthesis methods, which
incorporate a combination of photolithographic methods and solid phase oligonucleotide synthesis
(Fodor et al., Science, 251:767-777, 1991). The immobilization of arrays of oligonucleotides on solid
supports has been rendered possible by the development of a technology generally identified as "Very
Large Scale Immobilized Polymer Synthesis" (VLSIPS™) in which, typically, probes are
immobilized in a high density array on a solid surface of a chip. Examples of VLSIPS™ technologies
are provided in US Patents 5,143,854 and 5,412,087 and in PCT Publications WO 90/15070, WO
92/10092 and WO 95/11995, which describe methods for forming oligonucleotide arrays through
techniques such as light-directed synthesis techniques. In designing strategies aimed at providing
arrays of nucleotides immobilized on solid supports, further presentation strategies were developed to
order and display the oligonucleotide arrays on the chips in an attempt to maximize hybridization
patterns and sequence information. Examples of such presentation strategies are disclosed in PCT
Publications WO 94/12305, WO 94/11530, WO 91119111 and WO 97/31256.

Oligonucleotide arrays may comprise at least one of the sequences selected from the group
consisting of SEQ ID NOs: 3, 21-38, 57-62, 100-124, 179, 185-338, the preferred hybridization
probes for genic PG1-related biallelic markers described above; and the sequences complementary
thereto; or a fragment thereof of at least 15 consecutive nucleotides for determining whether a sample
contains one or more alleles of the biallelic markers of the present invention. Oligonucleotide arrays
may also comprise at least one of the sequences selected from the group consisting of SEQ ID NOs:
179, 339-4214; and the sequences complementary thereto or a fragment thereof of at least 15
consecutive nucleotides for amplifying one or more alleles of the PG1-related biallelic markers. In
other embodiments, arrays may also comprise at least one of the sequences selected from the group
consisting of SEQ ID NOs: 425-578, the preferred microsequencing primers for genic PG1-related biallelic
markers described above; and the sequences complementary thereto or a fragment thereof of at least
15 consecutive nucleotides for conducting microsequencing analyses to determine whether a sample
contains one or more alleles of a PG1-related biallelic marker.

The present invention further encompasses polynucleotide sequences that hybridize to any
one of SEQ ID NOs: 3, 69, 100-112, or 179-184 under conditions of high or intermediate stringency
as described below:

(i) By way of example and not limitation, procedures using conditions of high stringency are
as follows: Prehybridization of filters containing DNA is carried out for 8 h to overnight at 65°C in
buffer composed of 6X SSC, 50 mM Tris-HCl (pH 7.5), 1 mM EDTA, 0.02% PVP, 0.02% Ficoll,
0.02% BSA, and 500 µg/ml denatured salmon sperm DNA. Filters are hybridized for 48 h at 65°C,
the preferred hybridization temperature, in prehybridization mixture containing 100 µg/ml denatured
salmon sperm DNA and 5-20 X 10^6 cpm of 32P-labeled probe. Alternatively, the hybridization step
can be performed at 65°C in the presence of SSC buffer, 1 x SSC corresponding to 0.15 M NaCl and
0.05 M Na citrate. Subsequently, filter washes can be done at 37°C for 1 h in a solution containing
1X SSC, 0.01% PVP, 0.01% Ficoll, and 0.01% BSA, followed by a wash in 0.1X SSC at 50°C for 45
min. Alternatively, filter washes can be performed in a solution containing 2 x SSC and 0.1% SDS,
or 0.5 x SSC and 0.1% SDS, or 0.1 x SSC and 0.1% SDS at 68°C for 15 minute intervals. Following
the wash steps, the hybridized probes are detectable by autoradiography. Other conditions of high
stringency which may be used are well known in the art and as cited in Sambrook et al., 1989, Molecular
Cloning, A Laboratory Manual, Second Edition, Cold Spring Harbor Press, N.Y., pp. 9.47-9.57; and
Ausubel et al., 1989, Current Protocols in Molecular Biology, Green Publishing Associates and Wiley
Interscience, N.Y. Preferably, such sequences encode a homolog of a polypeptide encoded by one of
ORF2 to ORF1297. In one embodiment, such sequences encode a mammalian PG1 polypeptide.

(ii) By way of example and not limitation, procedures using conditions of intermediate
stringency are as follows: Filters containing DNA are prehybridized, and then hybridized at a
temperature of 60°C in the presence of a 5 x SSC buffer and labeled probe. Subsequently, filter
washes are performed in a solution containing 2x SSC at 50°C and the hybridized probes are
detectable by autoradiography. Other conditions of intermediate stringency which may be used are well
known in the art and as cited in Sambrook et al., 1989, Molecular Cloning, A Laboratory Manual,
Second Edition, Cold Spring Harbor Press, N.Y., pp. 9.47-9.57; and Ausubel et al., 1989, Current
Protocols in Molecular Biology, Green Publishing Associates and Wiley Interscience, N.Y.
Preferably, such sequences encode a homolog of a polypeptide encoded by one of SEQ ID NOs: 3, 69,
100-112, or 179-184. In one embodiment, such sequences encode a mammalian PG1 polypeptide.

The present invention also encompasses diagnostic kits comprising one or more
polynucleotides of the invention with a portion or all of the necessary reagents and instructions for
genotyping a test subject by determining the identity of a nucleotide at a PG1-related biallelic marker.
The polynucleotides of a kit may optionally be attached to a solid support, or be part of an array or
addressable array of polynucleotides. The kit may provide for the determination of the identity of the
nucleotide at a marker position by any method known in the art including, but not limited to, a
sequencing assay method, a microsequencing assay method, a hybridization assay method, or an allele
specific amplification method. Optionally such a kit may include instructions for scoring the results
of the determination with respect to the test subject's risk of contracting a cancer or prostate cancer,
or likely response to an anti-cancer agent or anti-prostate cancer agent, or chances of suffering from
side effects to an anti-cancer agent or anti-prostate cancer agent.

Use of PG1 Nucleic Acids as Reagents
The PG1 genomic DNA of SEQ ID NO: 179, the PG1 cDNA of SEQ ID NO: 3, 112-124 and
PG1 alleles responsible for a detectable phenotype (such as those obtainable by the methods of Example
12, and SEQ ID NO: 69) can be used to prepare PCR primers for use in diagnostic techniques or genetic
engineering methods such as those described above. Example 10 describes the use of the PG1 genomic
DNA of SEQ ID NO: 179, the PG1 cDNA of SEQ ID NO: 3, 112-124 and PG1 alleles responsible for a
detectable phenotype (such as those obtainable by the methods of Example 12) in PCR amplification
procedures.

Example 10

The PG1 genomic DNA of SEQ ID NO: 179, the PG1 cDNA of SEQ ID NO: 3, and PG1 alleles
responsible for a detectable phenotype (such as those obtainable by the methods of Example 12) can be used
to prepare PCR primers for a variety of applications, including isolation procedures for cloning nucleic
acids capable of hybridizing to such sequences, diagnostic techniques and forensic techniques. The PCR
primers comprise at least 10 consecutive bases of the PG1 genomic DNA of SEQ ID NO: 179, the PG1
cDNA of SEQ ID NO: 3, 112-124 and PG1 alleles responsible for a detectable phenotype (such as those
obtainable by the methods of Example 12) or the sequences complementary thereto. Preferably, the PCR
primers comprise at least 12, 15, or 17 consecutive bases of these sequences. More preferably, the PCR
primers comprise at least 20-30 consecutive bases of the PG1 genomic DNA of SEQ ID NO: 179, the
PG1 cDNA of SEQ ID NO: 3, 112-124 and PG1 alleles responsible for a detectable phenotype (such as
those obtainable by the methods of Example 12) or the sequences complementary thereto. In some
embodiments, the PCR primers may comprise more than 30 consecutive bases of the PG1 genomic DNA
of SEQ ID NO: 179, the PG1 cDNA of SEQ ID NO: 3, 112-124 and PG1 alleles responsible for a
detectable phenotype (such as those obtainable by the methods of Example 12) or the sequences
complementary thereto. It is preferred that the primer pairs to be used together in a PCR amplification
have approximately the same G/C ratio, so that melting temperatures are approximately the same.
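
By way of illustration only, the comparison of the G/C ratio and melting temperature of a primer pair
can be sketched in Python as follows. The sketch uses the Wallace rule (2°C per A or T and 4°C per G or
C), which is a rough estimate suitable only for short oligonucleotides and is an assumption of this
example rather than a method prescribed herein; the primer sequences and the 5°C tolerance are
likewise hypothetical.

```python
# Illustrative sketch only: primer sequences and the tolerance are hypothetical.

def gc_fraction(primer: str) -> float:
    """Fraction of bases in the primer that are G or C."""
    p = primer.upper()
    return (p.count("G") + p.count("C")) / len(p)


def wallace_tm(primer: str) -> float:
    """Rough melting temperature in degrees C by the Wallace rule: 2(A+T) + 4(G+C)."""
    p = primer.upper()
    at = p.count("A") + p.count("T")
    gc = p.count("G") + p.count("C")
    return 2.0 * at + 4.0 * gc


def primers_compatible(forward: str, reverse: str, max_tm_gap: float = 5.0) -> bool:
    """True when the two estimated melting temperatures differ by at most max_tm_gap."""
    return abs(wallace_tm(forward) - wallace_tm(reverse)) <= max_tm_gap


forward_primer = "ATGCATGCATGCATGCATGC"  # 20 bases, 10 of them G or C
reverse_primer = "GCGCATATGCGCATATGCGC"  # 20 bases, 12 of them G or C
pair_ok = primers_compatible(forward_primer, reverse_primer)
```

Under these assumptions the two toy primers have estimated melting temperatures of 60°C and 64°C and
would be treated as a compatible pair.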

A variety of PCR techniques are familiar to those skilled in the art. For a review of PCR
technology, see Molecular Cloning to Genetic Engineering, White, B.A. Ed. in Methods in Molecular
Biology 67: Humana Press, Totowa 1997. In each of these PCR procedures, PCR primers on either side
of the nucleic acid sequences to be amplified are added to a suitably prepared nucleic acid sample along
with dNTPs and a thermostable polymerase such as Taq polymerase, Pfu polymerase, or Vent
polymerase. The nucleic acid in the sample is denatured and the PCR primers are specifically hybridized
to complementary nucleic acid sequences in the sample. The hybridized primers are extended.
Thereafter, another cycle of denaturation, hybridization, and extension is initiated. The cycles are
repeated multiple times to produce an amplified fragment containing the nucleic acid sequence between
the primer sites.
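
By way of illustration only, the fragment bounded by a primer pair can be located in a template
sequence as sketched below in Python. The function names, the toy template and the primers are
assumptions made for this example; the sketch requires exact primer matches on a single template
strand, and the copy-number estimate assumes ideal doubling at every cycle, which real reactions do
not achieve.

```python
# Illustrative sketch only: names, template and primers are hypothetical.
from typing import Optional

_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the complementary strand, read 5' to 3'."""
    return seq.translate(_COMPLEMENT)[::-1]


def amplicon(template: str, forward_primer: str, reverse_primer: str) -> Optional[str]:
    """Return the region from the forward primer through the reverse primer site, or None."""
    start = template.find(forward_primer)
    if start == -1:
        return None
    # The reverse primer anneals to the given strand, so its reverse
    # complement is what appears in the template as written.
    site = reverse_complement(reverse_primer)
    end = template.find(site, start + len(forward_primer))
    if end == -1:
        return None
    return template[start:end + len(site)]


def ideal_copies(cycles: int, starting_copies: int = 1) -> int:
    """Upper bound on copy number, assuming perfect doubling in each cycle."""
    return starting_copies * 2 ** cycles


toy_template = "TTTTACGTACGGGGCCCCTTAGCAAAAA"
product = amplicon(toy_template, "ACGTAC", "TGCTAA")
```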

The polynucleotides of the invention also encompass vectors and DNA constructs as well as
other forms of primers and probes. For a thorough description of these embodiments please see
Sections VIII, X, and XI below.
III. POLYPEPTIDES

PG1 Proteins and Polypeptide Fragments
The term "PG1 polypeptides" is used herein to embrace all of the proteins and polypeptides of
the present invention. Also forming part of the invention are polypeptides encoded by the
polynucleotides of the invention, as well as fusion polypeptides comprising such polypeptides. The
invention embodies PG1 proteins from human (SEQ ID NOs: 4 and 5) and mouse (SEQ ID NO: 74).
However, PG1 species from other varieties of mammals are expressly contemplated and can be isolated
using the antibodies of the present invention in conjunction with standard affinity chromatography
methods as well as being expressed from the PG1 genes isolated from other mammalian sources using
human and mouse PG1 nucleic acid sequences as primers and probes as well as the methods described
herein.

The invention also embodies PG1 proteins translated from less common alternative splice
species, including SEQ ID NOs: 125-136, and PG1 proteins which result from naturally occurring
mutants, particularly functional mutants of PG1, including SEQ ID NO: 70, which can be identified and
obtained by the methods described herein. The present invention also embodies polypeptides comprising a
contiguous stretch of at least 6 amino acids, preferably at least 8 to 10 amino acids, more preferably at
least 12, 15, 20, 25, 50, or 100 amino acids of a PG1 protein. In a preferred embodiment the
contiguous stretch of amino acids comprises the site of a mutation or functional mutation, including a
deletion, addition, swap or truncation of the amino acids in the PG1 protein sequence. For instance,
polypeptides that contain either the Arg or His residue at amino acid position 184, and polypeptides
that contain either the Arg or Ile residue at amino acid position 293 of SEQ ID NO: 4 in said
contiguous stretch are particularly preferred embodiments of the invention and useful in the
manufacture of antibodies to detect the presence and absence of these mutations. Similarly,
polypeptides with a carboxy terminus at position 228 are a particularly preferred embodiment of the
invention and useful in the manufacture of antibodies to detect the presence and absence of the
mutation shown in SEQ ID NOs: 69 and 70.
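
By way of illustration only, the contiguous stretches of a given length that include a particular
amino acid position can be enumerated as sketched below in Python. The function name and the toy
protein sequence are assumptions made for this example and do not correspond to any sequence of the
sequence listing; positions are 1-based.

```python
# Illustrative sketch only: the function name and toy sequence are hypothetical.
from typing import List


def windows_containing(protein: str, position: int, length: int) -> List[str]:
    """Return every contiguous stretch of `length` residues that includes `position` (1-based)."""
    if not 1 <= position <= len(protein):
        raise ValueError("position lies outside the protein sequence")
    index = position - 1
    first_start = max(0, index - length + 1)        # earliest window that still covers the site
    last_start = min(index, len(protein) - length)  # latest window that fits in the protein
    return [protein[s:s + length] for s in range(first_start, last_start + 1)]


toy_protein = "MKTAYIAKQR"  # 10 residues; the site of interest is assumed to be residue 5 (Y)
candidate_peptides = windows_containing(toy_protein, position=5, length=6)
```

Each returned stretch is a candidate peptide containing the site of interest at a different offset,
which may be relevant when choosing an immunogen.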

Similarly, polypeptides that contain peptide sequences of 8, 10, 12, 15, or 25 amino
acids encoded over a naturally-occurring splice junction (the point at which two human PG1 exons
(SEQ ID NOs: 100-111) are covalently linked) in said contiguous stretch are particularly preferred
embodiments and useful in the manufacture of antibodies to detect the presence, localization, and
quantity of the various protein products of the PG1 alternative splice species.

PG1 proteins are preferably isolated from human, mouse or mammalian tissue samples or
expressed from human, mouse or mammalian genes.

The PG1 polypeptides of the invention can be made using routine expression methods known
in the art; see, for instance, Example 11, below. The polynucleotide encoding the desired
polypeptide is ligated into an expression vector suitable for any convenient host. Both eukaryotic
and prokaryotic host systems can be used in forming recombinant polypeptides, and a summary of some of
the more common systems is included in Sections II and VIII. The polypeptide is then isolated from
lysed cells or from the culture medium and purified to the extent needed for its intended use.
Purification is by any technique known in the art, for example, differential extraction, salt
fractionation, chromatography, centrifugation, and the like. See, for example, Methods in
Enzymology for a variety of methods for purifying proteins.
In addition, shorter protein fragments can be produced by chemical synthesis. Alternatively, the
proteins of the invention can be extracted from cells or tissues of humans or non-human animals. Methods
for purifying proteins are known in the art, and include the use of detergents or chaotropic agents to
disrupt particles followed by differential extraction and separation of the polypeptides by ion
exchange chromatography, affinity chromatography, sedimentation according to density, and gel
electrophoresis.

Expression of the PG1 Protein
Any PG1 cDNA, including SEQ ID NO: 3, 69, 112-124, or 184, or synthetic DNAs can be used as
described in Example 11 below to express PG1 proteins and polypeptides.

Example 11

The nucleic acid encoding the PG1 protein or polypeptide to be expressed is operably linked to a
promoter in an expression vector using conventional cloning technology. The PG1 insert in the
expression vector may comprise the full coding sequence for the PG1 protein or a portion thereof. For
example, the PG1 derived insert may encode a polypeptide comprising at least 10 consecutive amino
acids of the PG1 proteins of SEQ ID NO: 4.

The expression vector can be any of the mammalian, yeast, insect or bacterial expression systems
known in the art; see, for example, Section VIII. Commercially available vectors and expression systems
are available from a variety of suppliers including Genetics Institute (Cambridge, MA), Stratagene (La
Jolla, California), Promega (Madison, Wisconsin), and Invitrogen (San Diego, California). If desired, to
enhance expression and facilitate proper protein folding, the codon context and codon pairing of the
sequence can be optimized for the particular expression organism in which the expression vector is
introduced, as explained by Hatfield, et al., U.S. Patent No. 5,082,767.

The following is provided as one exemplary method to express the PG1 protein or a portion
thereof. In one embodiment, the entire coding sequence of the PG1 cDNA through the poly A signal of
the cDNA is operably linked to a promoter in the expression vector. Alternatively, if the nucleic acid
encoding a portion of the PG1 protein lacks a methionine to serve as the initiation site, an initiating
methionine can be introduced next to the first codon of the nucleic acid using conventional techniques.
Similarly, if the insert from the PG1 cDNA lacks a poly A signal, this sequence can be added to the
construct by, for example, splicing out the poly A signal from pSG5 (Stratagene) using BglI and SalI
restriction endonuclease enzymes and incorporating it into the mammalian expression vector pXT1
(Stratagene). pXT1 contains the LTRs and a portion of the gag gene from Moloney Murine Leukemia
Virus. The position of the LTRs in the construct allows efficient stable transfection. The vector includes
the Herpes Simplex Thymidine Kinase promoter and the selectable neomycin gene. The nucleic acid
encoding the PG1 protein or a portion thereof is obtained by PCR from a bacterial vector containing the
PG1 cDNA of SEQ ID NO: 3 using oligonucleotide primers complementary to the PG1 cDNA or portion
thereof and containing restriction endonuclease sequences for Pst I incorporated into the 5' primer and
BglII at the 5' end of the corresponding cDNA 3' primer, taking care to ensure that the sequence
encoding the PG1 protein or a portion thereof is positioned properly with respect to the poly A signal.
The purified fragment obtained from the resulting PCR reaction is digested with Pst I, blunt ended with
an exonuclease, digested with Bgl II, purified and ligated to pXT1, now containing a poly A signal and
digested with BglII.

The ligated product is transfected into mouse NIH 3T3 cells using Lipofectin (Life
Technologies, Inc., Grand Island, New York) under conditions outlined in the product specification.
Positive transfectants are selected after growing the transfected cells in 600 µg/ml G418 (Sigma, St.
Louis, Missouri).

Alternatively, the nucleic acids encoding the PG1 protein or a portion thereof can be cloned into
pED6dpc2 (Genetics Institute, Cambridge, MA). The resulting pED6dpc2 constructs can be transfected into
a suitable host cell, such as COS 1 cells. Methotrexate resistant cells are selected and expanded.

The above procedures may also be used to express a mutant PG1 protein responsible for a
detectable phenotype or a portion thereof.

The expressed proteins can be purified using conventional purification techniques such as ammonium
sulfate precipitation or chromatographic separation based on size or charge. The protein encoded by the
nucleic acid insert may also be purified using standard immunochromatography techniques. In such
procedures, a solution containing the expressed PG1 protein or portion thereof, such as a cell extract, is
applied to a column having antibodies against the PG1 protein or portion thereof attached to the
chromatography matrix. The expressed protein is allowed to bind the immunochromatography column.
Thereafter, the column is washed to remove non-specifically bound proteins. The specifically bound
expressed protein is then released from the column and recovered using standard techniques.

To confirm expression of the PG1 protein or a portion thereof, the proteins expressed from host
cells containing an expression vector containing an insert encoding the PG1 protein or a portion thereof
can be compared to the proteins expressed in host cells containing the expression vector without an
insert. The presence of a band in samples from cells containing the expression vector with an insert
which is absent in samples from cells containing the expression vector without an insert indicates that
the PG1 protein or a portion thereof is being expressed. Generally, the band will have the mobility
expected for the PG1 protein or portion thereof. However, the band may have a mobility different than
that expected as a result of modifications such as glycosylation, ubiquitination, or enzymatic cleavage.

Antibodies capable of specifically recognizing the expressed PG1 protein or a portion thereof can be
generated as described below in Section VII.

If antibody production is not possible, the nucleic acids encoding the PG1 protein or a portion
thereof can be incorporated into expression vectors designed for use in purification schemes employing
chimeric polypeptides. In such strategies the nucleic acid encoding the PG1 protein or a portion thereof
is inserted in frame with the gene encoding the other half of the chimera. The other half of the chimera
can be β-globin or a nickel binding polypeptide encoding sequence. A chromatography matrix having
antibody to β-globin or nickel attached thereto is then used to purify the chimeric protein. Protease
cleavage sites can be engineered between the β-globin gene or the nickel binding polypeptide and the PG1
protein or portion thereof. Thus, the two polypeptides of the chimera can be separated from one another by
protease digestion.

One useful expression vector for generating β-globin chimerics is pSG5 (Stratagene), which encodes
rabbit β-globin. Intron II of the rabbit β-globin gene facilitates splicing of the expressed transcript, and
the polyadenylation signal incorporated into the construct increases the level of expression. These
techniques are well known to those skilled in the art of molecular biology. Standard methods are
published in methods texts such as Davis et al. (Basic Methods in Molecular Biology, L.G. Davis, M.D.
Dibner, and J.F. Battey, ed., Elsevier Press, NY, 1986) and many of the methods are available from
Stratagene, Life Technologies, Inc., or Promega. Polypeptide may additionally be produced from the
construct using in vitro translation systems such as the In vitro Express™ Translation Kit (Stratagene).
IV. IDENTIFICATION OF MUTATIONS IN THE PG1 GENE WHICH ARE ASSOCIATED
WITH A DETECTABLE PHENOTYPE

Mutations in the PG1 gene which are responsible for a detectable phenotype can be identified by
comparing the sequences of the PG1 genes from affected and unaffected individuals as described in
Example 12, below. The detectable phenotype may comprise a variety of manifestations of altered
PG1 function, including prostate cancer, hepatocellular carcinoma, colorectal cancer, non-small cell
lung cancer, squamous cell carcinoma, or other conditions. The mutations may comprise point
mutations, deletions, or insertions of the PG1 gene. The mutations may lie within the coding
sequence for the PG1 protein or within regulatory regions in the PG1 gene.

Example 12 

Oligonucleotide primers are designed to amplify the sequences of each of the exons or the
promoter region of the PG1 gene. The oligonucleotide primers may comprise at least 10 consecutive
nucleotides of the PG1 genomic DNA of SEQ ID NO: 179 or the PG1 cDNA of SEQ ID NO: 3 or the
sequences complementary thereto. Preferably, the oligonucleotides comprise at least 15 consecutive
nucleotides of the PG1 genomic DNA of SEQ ID NO: 179 or the PG1 cDNA of SEQ ID NO: 3 or the
sequences complementary thereto. In some embodiments, the oligonucleotides may comprise at least
20 consecutive nucleotides of the PG1 genomic DNA of SEQ ID NO: 179 or the PG1 cDNA of SEQ
ID NO: 3 or the sequences complementary thereto. In other embodiments, the oligonucleotides may
comprise 25 or more consecutive nucleotides of the PG1 genomic DNA of SEQ ID NO: 179 or the
PG1 cDNA of SEQ ID NO: 3 or the sequences complementary thereto.
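
By way of illustration only, a candidate primer can be checked against a reference sequence and its
complementary strand as sketched below in Python. The function names and the toy reference are
assumptions made for this example. As a simplification, the sketch requires the entire primer to match
the reference, whereas a primer as described above need only comprise the stated number of
consecutive nucleotides.

```python
# Illustrative sketch only: names and the toy reference sequence are hypothetical.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the complementary strand, read 5' to 3' (upper-case input expected)."""
    return seq.translate(_COMPLEMENT)[::-1]


def is_valid_primer(primer: str, reference: str, min_length: int = 10) -> bool:
    """True if the primer has at least min_length bases and occurs in either strand."""
    p = primer.upper()
    r = reference.upper()
    if len(p) < min_length:
        return False
    return p in r or p in reverse_complement(r)


toy_reference = "ATGGCGTACGTTAGCCTAGGCTAACGT"
sense_primer = "GCGTACGTTAGC"                                # taken from the strand as written
antisense_primer = reverse_complement(toy_reference)[:12]  # taken from the complementary strand
```

The `min_length` argument can be set to 10, 15, 20 or 25 to reflect the alternative minimum lengths
recited above.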


Each primer pair is used to amplify the exon or promoter region from which it is derived.
Amplification is carried out on genomic DNA samples from affected patients and unaffected controls
using the PCR conditions described above. Amplification products from the genomic PCRs are
subjected to automated dideoxy terminator sequencing reactions and electrophoresed on ABI 377
sequencers. Following gel image analysis and DNA sequence extraction, ABI sequence data are
automatically analyzed to detect the presence of sequence variations among affected and unaffected
individuals. Sequences are verified by determining the sequences of both DNA strands for each
individual. Preferably, these candidate mutations are detected by comparing individuals homozygous
for haplotype 5 of Figure 4 and controls not carrying haplotype 5 or related haplotypes.
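
By way of illustration only, the detection of sequence variations among affected and unaffected
individuals can be sketched in Python as follows. The sketch assumes that the sequences have already
been aligned to equal length and ignores heterozygous base calls; the function name and the toy
sequences are hypothetical and are not the analysis software referred to above.

```python
# Illustrative sketch only: assumes pre-aligned, equal-length sequences (hypothetical data).
from collections import Counter
from typing import Dict, List, Tuple


def sequence_variations(
    affected: List[str], unaffected: List[str]
) -> Dict[int, Tuple[Counter, Counter]]:
    """Map each variable position (1-based) to the base counts in each group."""
    sequences = affected + unaffected
    length = len(sequences[0])
    if any(len(s) != length for s in sequences):
        raise ValueError("sequences must be aligned to the same length")
    variations = {}
    for i in range(length):
        # A position is variable when more than one base is observed across all individuals.
        if len({s[i] for s in sequences}) > 1:
            variations[i + 1] = (
                Counter(s[i] for s in affected),
                Counter(s[i] for s in unaffected),
            )
    return variations


toy_affected = ["ACGTA", "ACGTA"]
toy_unaffected = ["ACCTA", "ACCTA"]
candidates = sequence_variations(toy_affected, toy_unaffected)
```

In this toy case only position 3 varies, with G observed in the affected group and C in the
unaffected group.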

Candidate polymorphisms suspected of being responsible for the detectable phenotype, such
as prostate cancer or other conditions, are then verified by screening a larger population of affected
and unaffected individuals using the microsequencing technique described above. Polymorphisms
which exhibit a statistically significant correlation with the detectable phenotype are deemed
responsible for the detectable phenotype.
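
By way of illustration only, one common way of assessing such a correlation is a Pearson chi-square
test on a 2 x 2 table of allele counts, sketched below in Python. The choice of test, the 0.05
significance level and the allele counts are assumptions made for this example; no particular
statistical test is prescribed here, and the counts are invented for illustration and are not data.

```python
# Illustrative sketch only: the test choice, threshold and counts are assumptions.

# Critical value of the chi-square distribution for 1 degree of freedom at p = 0.05.
CHI2_CRITICAL_DF1_P05 = 3.841


def chi_square_2x2(a: int, b: int, c: int, d: int) -> float:
    """Pearson chi-square statistic (no continuity correction) for the table [[a, b], [c, d]].

    a, b: counts of allele 1 and allele 2 in affected individuals
    c, d: counts of allele 1 and allele 2 in unaffected individuals
    """
    n = a + b + c + d
    row1, row2, col1, col2 = a + b, c + d, a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("every row and column of the table must be non-empty")
    return n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)


# Invented allele counts, for illustration only.
statistic = chi_square_2x2(30, 70, 15, 85)
significant = statistic > CHI2_CRITICAL_DF1_P05
```

When many polymorphisms are screened, a correction for multiple testing would ordinarily be applied
before a correlation is regarded as significant.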

Other techniques may also be used to detect polymorphisms associated with a detectable
phenotype such as prostate cancer or other conditions. For example, polymorphisms can be detected using
single stranded conformation analyses such as those described in Orita et al., Proc. Natl. Acad. Sci.
U.S.A. 86:2766-2770 (1989). In this approach, polymorphisms are detected through altered
migration on SSCA gels.

Alternatively, polymorphisms can be identified using clamped denaturing gel electrophoresis,
heteroduplex analysis, chemical mismatch cleavage, and other conventional techniques as described
in Sheffield et al., Proc. Natl. Acad. Sci. U.S.A. 49:699-706 (1991); White, M.B. et al.,
Genomics 12:301-306 (1992); Grompe, M. et al., Proc. Natl. Acad. Sci. U.S.A. 86:5855-5892 (1989);
and Grompe, M., Nature Genetics 5:111-117 (1993).

The PG1 genes from individuals carrying PG1 mutations responsible for the detectable
phenotype, or cDNAs derived therefrom, can be cloned as follows. Nucleic acid samples are obtained
from individuals having a PG1 mutation associated with the detectable phenotype. The nucleic acid
samples are contacted with a probe derived from the PG1 genomic DNA of SEQ ID NO: 179 or the
PG1 cDNA of SEQ ID NO: 3. Nucleic acids containing the mutant PG1 allele are identified using
conventional techniques. For example, the mutant PG1 gene, or a cDNA derived therefrom, can be
obtained by conducting an amplification reaction using primers derived from the PG1 genomic DNA
of SEQ ID NO: 179 or the PG1 cDNA of SEQ ID NO: 3. Alternatively, the mutant PG1 gene, or a
cDNA derived therefrom, can be identified by hybridizing a genomic library or a cDNA library obtained
from an individual having a mutant PG1 gene with a detectable probe derived from the PG1 genomic
DNA of SEQ ID NO: 179 or the PG1 cDNA of SEQ ID NO: 3. Alternatively, the mutant PG1 allele
can be obtained by contacting an expression library from an individual carrying a PG1 mutation with a
detectable antibody against the PG1 proteins of SEQ ID NO: 4 or SEQ ID NO: 5 which has been
prepared as described below. Those skilled in the art will appreciate that the PG1 genomic DNA of
SEQ ID NO: 179, the PG1 cDNA of SEQ ID NO: 3 and the PG1 proteins of SEQ ID NOs: 4 and 5 can be
used in a variety of other conventional techniques to obtain the mutant PG1 gene.

In another embodiment the mutant PG1 allele which causes a detectable phenotype can be
isolated by obtaining a nucleic acid sample such as a genomic library or a cDNA library from an
individual expressing the detectable phenotype. The nucleic acid sample can be contacted with one or
more probes lying in the 8p23 region of the human genome. Nucleic acids in the sample which
contain the PG1 gene can be identified by conducting sequencing reactions on the nucleic acids which
hybridize to the markers from the 8p23 region of the human genome.

The region of the PG1 gene containing the mutation responsible for the detectable phenotype
may also be used in diagnostic techniques such as those described below. For example,
oligonucleotides containing the mutation responsible for the detectable phenotype may be used in
amplification or hybridization based diagnostics, such as those described herein, for detecting
individuals suffering from the detectable phenotype or individuals at risk of developing the detectable
phenotype at a subsequent time. In addition, the PG1 allele responsible for the detectable phenotype
is used in gene therapy as described herein. The PG1 allele responsible for the detectable phenotype
may also be cloned into an expression vector to express the mutant PG1 protein as described herein.

During the search for biallelic markers associated with prostate cancer, a number of
polymorphic bases were discovered which lie within the PG1 gene. The identities and positions of
these polymorphic bases are listed as features in the accompanying Sequence Listing for the PG1
genomic DNA of SEQ ID NO: 179. The polymorphic bases may be used in the above-described diagnostic
techniques for determining whether an individual is at risk for developing prostate cancer at a
subsequent date or suffers from prostate cancer as a result of a PG1 mutation. The identities of the
nucleotides present at the polymorphic positions in a nucleic acid sample may be determined using
techniques, such as microsequencing analysis, which are described above.

It is possible that one or more of these polymorphisms (or other polymorphic bases) may be
mutations which are associated with prostate cancer. To determine whether a polymorphism is
responsible for prostate cancer, the frequency of each of the alleles in individuals suffering from
prostate cancer and unaffected individuals is measured as described in the haplotype analysis above.
Those mutations which occur at a statistically significant frequency in the affected population are
deemed to be responsible for prostate cancer.

cDNAs containing the identified mutant PG1 gene may be prepared as described above and cloned
into expression vectors as described below. The proteins expressed from the expression vectors may be
used to generate antibodies specific for the mutant PG1 proteins as described below. In addition,
allele specific probes containing the PG1 mutation responsible for prostate cancer may be used in the
diagnostic techniques described below.

Genes sharing homology to the PG1 gene may be identified as follows.

Example 13

Alternatively, a cDNA library or genomic DNA library to be screened for genes sharing
homology to the PG1 gene is obtained from a commercial source or made using techniques familiar to
those skilled in the art. The cDNA library or genomic DNA library is hybridized to a detectable probe
comprising at least 10 consecutive nucleotides from the PG1 cDNA of SEQ ID NO: 3, the PG1 genomic
DNA of SEQ ID NO: 179, or the sequences complementary thereto, using conventional techniques.
Preferably, the probe comprises at least 12, 15, or 17 consecutive nucleotides from the PG1 cDNA of
SEQ ID NO: 3, the PG1 genomic DNA of SEQ ID NO: 179, or the sequences complementary thereto.
More preferably, the probe comprises at least 20-30 consecutive nucleotides from the PG1 cDNA of
SEQ ID NO: 3, the PG1 genomic DNA of SEQ ID NO: 179, or the sequences complementary thereto. In
some embodiments, the probe comprises more than 30 nucleotides from the PG1 cDNA of SEQ ID
NO: 3, the PG1 genomic DNA of SEQ ID NO: 179, or the sequences complementary thereto.

Techniques for identifying cDNA clones in a cDNA library which hybridize to a given probe
sequence are disclosed in Sambrook et al., Molecular Cloning: A Laboratory Manual 2d Ed., Cold
Spring Harbor Laboratory Press, 1989. The same techniques may be used to isolate genomic DNAs sharing
homology with the PG1 gene.

Briefly, cDNA or genomic DNA clones which hybridize to the detectable probe are identified
and isolated for further manipulation as follows. A probe comprising at least 10 consecutive nucleotides
from the PG1 cDNA of SEQ ID NO: 3, the PG1 genomic DNA of SEQ ID NO: 179, or the sequences
complementary thereto, is labeled with a detectable label such as a radioisotope or a fluorescent
molecule. Preferably, the probe comprises at least 12, 15, or 17 consecutive nucleotides from the PG1
cDNA of SEQ ID NO: 3, the PG1 genomic DNA of SEQ ID NO: 179, or the sequences complementary
thereto. More preferably, the probe comprises 20-30 consecutive nucleotides from the PG1 cDNA of
SEQ ID NO: 3, the PG1 genomic DNA of SEQ ID NO: 179, or the sequences complementary thereto. In
some embodiments, the probe comprises more than 30 nucleotides from the PG1 cDNA of SEQ ID
NO: 3, the PG1 genomic DNA of SEQ ID NO: 179, or the sequences complementary thereto.

Techniques for labeling the probe are well known and include phosphorylation with
polynucleotide kinase, nick translation, in vitro transcription, and non-radioactive techniques. The
cDNAs or genomic DNAs in the library are transferred to a nitrocellulose or nylon filter and denatured.
After incubation of the filter with a blocking solution, the filter is contacted with the labeled probe and
incubated for a sufficient amount of time for the probe to hybridize to cDNAs or genomic DNAs
containing a sequence capable of hybridizing to the probe.

By varying the stringency of the hybridization conditions used to identify cDNAs or genomic
DNAs which hybridize to the detectable probe, cDNAs or genomic DNAs having different levels of
homology to the probe can be identified and isolated. To identify cDNAs or genomic DNAs having a
high degree of homology to the probe sequence, the melting temperature of the probe is calculated using
the following formulas:

For probes between 14 and 70 nucleotides in length the melting temperature (Tm) is calculated
using the formula: Tm=81.5+16.6(log (Na+))+0.41(fraction G+C)-(600/N) where N is the length of the
probe.

If the hybridization is carried out in a solution containing formamide, the melting temperature is
calculated using the equation Tm=81.5+16.6(log (Na+))+0.41(fraction G+C)-(0.63% formamide)-(600/N)
where N is the length of the probe.
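
The two formulas above can be sketched in Python as follows. This is an illustration under stated assumptions, not a definitive implementation: the G+C term is taken as a percentage (the way the 0.41 coefficient is commonly applied), the formamide term is read as 0.63 multiplied by the percentage of formamide, the logarithm is taken to base 10, and the 14 to 70 nucleotide range is assumed to apply to both formulas. The function name and the example probe are hypothetical.

```python
import math

def melting_temperature(sequence, na_molar, formamide_percent=0.0):
    """Tm = 81.5 + 16.6*log10([Na+]) + 0.41*(%G+C) - 0.63*(%formamide) - 600/N.

    sequence: probe sequence (A, C, G, T); N is its length.
    na_molar: sodium ion concentration in mol/L.
    formamide_percent: 0 reproduces the formula without formamide.
    """
    seq = sequence.upper()
    n = len(seq)
    if not 14 <= n <= 70:
        raise ValueError("formula is stated for probes of 14 to 70 nucleotides")
    if na_molar <= 0:
        raise ValueError("sodium concentration must be positive")
    # Assumption: G+C content is expressed in percent.
    gc_percent = 100.0 * (seq.count("G") + seq.count("C")) / n
    return (81.5
            + 16.6 * math.log10(na_molar)
            + 0.41 * gc_percent
            - 0.63 * formamide_percent
            - 600.0 / n)

# Hypothetical 20-nucleotide probe with 50% G+C in 1 M Na+.
probe = "ATGCATGCATGCATGCATGC"
tm_aqueous = melting_temperature(probe, na_molar=1.0)
tm_formamide = melting_temperature(probe, na_molar=1.0, formamide_percent=50.0)
```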

Prehybridization is carried out in 6X SSC, 5X Denhardt's reagent, 0.5% SDS, 100 μg denatured
fragmented salmon sperm DNA or 6X SSC, 5X Denhardt's reagent, 0.5% SDS, 100 μg denatured
fragmented salmon sperm DNA, 50% formamide. The formulas for SSC and Denhardt's solutions are
listed in Sambrook et al., supra.

Hybridization is conducted by adding the detectable probe to the prehybridization solutions
listed above. Where the probe comprises double stranded DNA, it is denatured before addition to the
hybridization solution. The filter is contacted with the hybridization solution for a sufficient period of
time to allow the probe to hybridize to cDNAs or genomic DNAs containing sequences complementary
thereto or homologous thereto. For probes over 200 nucleotides in length, the hybridization is carried
out at 15-25°C below the Tm. For shorter probes, such as oligonucleotide probes, the hybridization is
conducted at 15-25°C below the Tm. Preferably, for hybridizations in 6X SSC, the hybridization is
conducted at approximately 68°C. Preferably, for hybridizations in 50% formamide containing
solutions, the hybridization is conducted at approximately 42°C.

All of the foregoing hybridizations would be considered to be under "stringent" conditions.
Following hybridization, the filter is washed in 2X SSC, 0.1% SDS at room temperature for 15
minutes. The filter is then washed with 0.1X SSC, 0.5% SDS at room temperature for 30 minutes to 1
hour. Thereafter, the solution is washed at the hybridization temperature in 0.1X SSC, 0.5% SDS. A
final wash is conducted in 0.1X SSC at room temperature.

cDNAs or genomic DNAs homologous to the PG1 gene which have hybridized to the probe are
identified by autoradiography or other conventional techniques.

The above procedure is modified to identify cDNAs or genomic DNAs having decreasing levels
of homology to the probe sequence. For example, to obtain cDNAs or genomic DNAs of decreasing
homology to the detectable probe, less stringent conditions may be used. For example, the hybridization
temperature is decreased in increments of 5°C from 68°C to 42°C in a hybridization buffer having a
Na+ concentration of approximately 1 M. Following hybridization, the filter is washed with 2X SSC,
0.5% SDS at the temperature of hybridization. These conditions are considered to be "moderate"
conditions above 50°C and "low" conditions below 50°C.

Alternatively, the hybridization is carried out in buffers, such as 6X SSC, containing formamide
at a temperature of 42°C. In this case, the concentration of formamide in the hybridization buffer is
reduced in 5% increments from 50% to 0% to identify clones having decreasing levels of homology to
the probe. Following hybridization, the filter is washed with 6X SSC, 0.5% SDS at 50°C. These
conditions are considered to be "moderate" conditions above 25% formamide and "low" conditions
below 25% formamide.
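
The two reduced-stringency series described above can be enumerated with a short Python sketch. The function names are hypothetical. Two points are assumptions of the sketch: the temperature series is closed on 42°C even though 5°C steps from 68°C do not land on it exactly, and the values of exactly 50°C and exactly 25% formamide, which the text does not classify, are labeled "boundary".

```python
def temperature_series(start=68, stop=42, step=5):
    """Hybridization temperatures (deg C) from start down to stop."""
    temps = list(range(start, stop - 1, -step))
    if temps[-1] != stop:
        temps.append(stop)  # assumption: finish on the stated lower limit
    return temps

def formamide_series(start=50, stop=0, step=5):
    """Formamide concentrations (%) from start down to stop."""
    return list(range(start, stop - 1, -step))

def classify(value, threshold):
    """'moderate' above the threshold, 'low' below it, per the text."""
    if value > threshold:
        return "moderate"
    if value < threshold:
        return "low"
    return "boundary"  # not classified by the text

temps = temperature_series()
by_temperature = {t: classify(t, 50) for t in temps}
by_formamide = {f: classify(f, 25) for f in formamide_series()}
```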

cDNAs or genomic DNAs which have hybridized to the probe are identified by autoradiography. 

If it is desired to obtain nucleic acids homologous to the PG1 gene, such as allelic variants
thereof or nucleic acids encoding proteins related to the PG1 protein, the level of homology between the
hybridized nucleic acid and the PG1 gene may readily be determined. To determine the level of
homology between the hybridized nucleic acid and the PG1 gene, the nucleotide sequences of the
hybridized nucleic acid and the PG1 gene are compared. For example, using the above methods, nucleic
acids having at least 95% nucleic acid homology to the PG1 gene may be obtained and identified. Similarly,
by using progressively less stringent hybridization conditions one can obtain and identify nucleic acids
having at least 90%, at least 85%, at least 80% or at least 75% homology to the PG1 gene.
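
One simple way to carry out such a sequence comparison is sketched below in Python. The sketch assumes that the two sequences have already been aligned to equal length, that "homology" is measured as percent identity over ungapped alignment columns, and that the thresholds are those listed above. The function names and example sequences are hypothetical, and the text does not prescribe this particular measure.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over the ungapped columns of two pre-aligned sequences."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = matches = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == gap or y == gap:
            continue  # gapped columns are skipped in this sketch
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared

def homology_tier(identity, tiers=(95, 90, 85, 80, 75)):
    """Highest listed threshold met by the identity value, or None."""
    for tier in tiers:
        if identity >= tier:
            return tier
    return None

# Hypothetical aligned sequences differing at one position out of ten.
identity = percent_identity("ACGTACGTAC", "ACGTACGTAA")
tier = homology_tier(identity)
```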

To determine whether a clone encodes a protein having a given amount of homology to the PG1
protein, the amino acid sequence of the PG1 protein is compared to the amino acid sequence encoded by
the hybridizing nucleic acid. Homology is determined to exist when an amino acid sequence in the PG1
protein is closely related to an amino acid sequence in the hybridizing nucleic acid. A sequence is
closely related when it is identical to that of the PG1 sequence or when it contains one or more amino
acid substitutions therein in which amino acids having similar characteristics have been substituted for
one another. Using the above methods, one can obtain nucleic acids encoding proteins having at least
95%, at least 90%, at least 85%, at least 80% or at least 75% homology to the proteins encoded by the
PG1 probe.

Isolation and Use of Mutant or Low Frequency PG1 Alleles from Mammalian Prostate Tumor Tissues
and Cell Lines

A single mutant PG1 gene was isolated from a human prostate cancer cell line. The nucleic acid
sequence and amino acid sequence of this mutant PG1 are disclosed in SEQ ID NOs: 69 and 70,
respectively. This mutant was found to contain a stop codon at codon position number 229, and
therefore results in a truncated gene product of only 228 amino acids. The present invention
encompasses purified or isolated nucleic acids comprising at least 8, 10, 12, 15, 20, or 25 consecutive
nucleotides of SEQ ID NO: 69, preferably containing the mutation in codon number 229. A preferred
embodiment of the present invention encompasses purified or isolated nucleic acids comprising at least
8, 10, 12, 15, 20, or 25 consecutive nucleotides of SEQ ID NO: 71.
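
The relationship between the position of the stop codon and the length of the truncated product can be illustrated with the following Python sketch. The coding sequence used here is a hypothetical placeholder built only to place a stop codon at codon 229; it is not the sequence of SEQ ID NO: 69, and the function name is likewise an assumption.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}

def first_stop_codon(cds):
    """Return the 1-based position of the first in-frame stop codon, or None."""
    cds = cds.upper()
    for i in range(0, len(cds) - 2, 3):
        if cds[i:i + 3] in STOP_CODONS:
            return i // 3 + 1
    return None

# Hypothetical coding sequence: start codon, 227 sense codons, a premature
# stop at codon 229, then further codons that are never translated.
cds = "ATG" + "GCT" * 227 + "TGA" + "GCT" * 10
stop_position = first_stop_codon(cds)
product_length = stop_position - 1  # residues encoded before the stop
```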

The present invention is also directed to methods of determining whether an individual is at
risk of developing prostate cancer at a later date or whether said individual suffers from prostate
cancer as a result of a mutation in the PG1 gene comprising: obtaining a nucleic acid sample from
said individual; and determining whether the nucleotides present at one or more of the polymorphic
bases in the sequences selected from the group consisting of SEQ ID NOs: 69 and 71 are indicative of
a risk of developing prostate cancer at a later date or indicative of prostate cancer resulting from a
mutation in the PG1 gene. The present invention also includes purified or isolated nucleic acids
encoding at least 4, 8, 10, 12, 15, or 20 consecutive amino acids of the polypeptide of SEQ ID NO:
70, preferably including the carboxy terminus of said polypeptide. The isolated or purified
polypeptides of the invention include polypeptides comprising at least 4, 8, 10, 12, 15, or 20
consecutive amino acids of the polypeptide of SEQ ID NO: 70, preferably including the carboxy
terminus of said polypeptide.

V. DIAGNOSIS OF INDIVIDUALS AT RISK FOR DEVELOPING PROSTATE CANCER
OR INDIVIDUALS SUFFERING FROM PROSTATE CANCER AS A RESULT OF A
MUTATION IN THE PG1 GENE

Individuals may then be screened for the presence of polymorphisms in the PG1 gene or
protein which are associated with a detectable phenotype such as cancer, prostate cancer or other
conditions as described in Example 13, below. The individuals may be screened while they are
asymptomatic to determine their risk of developing cancer, prostate cancer or other conditions at a
subsequent time. Alternatively, individuals suffering from cancer, prostate cancer or other conditions
may be screened for the presence of polymorphisms in the PG1 gene or protein in order to determine
whether therapies which target the PG1 gene or protein should be applied.

Example 14 

Nucleic acid samples are obtained from a symptomatic or asymptomatic individual. The
nucleic acid samples may be obtained from blood cells as described above or from other tissues
or organs. For individuals suffering from prostate cancer, the nucleic acid sample is obtained from
the tumor. The nucleic acid sample may comprise DNA, RNA, or both. The nucleotides at positions
in the PG1 gene where mutations lead to prostate cancer or other detectable phenotypes are
determined for the nucleic acid sample.

In one embodiment, a PCR amplification is conducted on the nucleic acid sample as described
above to amplify regions in which polymorphisms associated with prostate cancer or other detectable
phenotypes have been identified. The amplification products are sequenced to determine whether the
individual possesses one or more PG1 polymorphisms associated with prostate cancer or other
detectable phenotypes.

Alternatively, the nucleic acid sample is subjected to microsequencing reactions as described
above to determine whether the individual possesses one or more PG1 polymorphisms associated with
prostate cancer or another detectable phenotype resulting from a mutation in the PG1 gene.

In another embodiment, the nucleic acid sample is contacted with one or more allele specific
oligonucleotides which specifically hybridize to one or more PG1 alleles associated with prostate
cancer or another detectable phenotype. The nucleic acid sample is also contacted with a second PG1
oligonucleotide capable of producing an amplification product when used with the allele specific
oligonucleotide in an amplification reaction. The presence of an amplification product in the
amplification reaction indicates that the individual possesses one or more PG1 alleles associated with
prostate cancer or another detectable phenotype.

Determination of PG1 Expression Levels

As discussed above, PG1 polymorphisms associated with cancer, prostate cancer or other
detectable phenotypes may exert their effects by increasing, decreasing, or eliminating PG1
expression, or by altering the frequency of various transcription species. Accordingly, PG1 expression
levels in individuals suffering from cancer, prostate cancer or other detectable phenotypes may be
compared to those of unaffected individuals to determine whether over-expression, under-expression,
loss of expression, or changes in the relative frequency of transcription species of PG1 cause cancer,
prostate cancer or another detectable phenotype. Individuals may be tested to determine whether they
are at risk of developing cancer or prostate cancer at a subsequent time or whether they suffer from
prostate cancer resulting from a mutation in the PG1 gene by determining whether they exhibit a level
of PG1 expression associated with prostate cancer. Similarly, individuals may be tested to determine
whether they suffer from another PG1 mediated detectable phenotype or whether they are at risk of
suffering from such a condition at a subsequent time.

Expression levels in nucleic acid samples from affected and unaffected individuals may be
determined by performing Northern blots using detectable probes derived from the PG1 gene or the
PG1 cDNA. A variety of conventional Northern blotting procedures may be used to detect and quantitate
PG1 expression and the frequencies of the various transcription species of PG1, including those
disclosed in Current Protocols in Molecular Biology, John Wiley & Sons, Inc. 1997 and Sambrook et
al., Molecular Cloning: A Laboratory Manual, Second Edition, Cold Spring Harbor Laboratory Press,
1989.

Alternatively, PG1 expression levels may be determined as described in Example 15, below.

Example 15 

Expression levels and patterns of PG1 may be analyzed by solution hybridization with long probes as
described in International Patent Application No. WO 97/05277. Briefly, the PG1 cDNA or the PG1
genomic DNA described above, or fragments thereof, is inserted at a cloning site immediately
downstream of a bacteriophage (T3, T7 or SP6) RNA polymerase promoter to produce antisense RNA.
Preferably, the PG1 insert comprises at least 100 or more consecutive nucleotides of the genomic DNA
sequence of SEQ ID NO: 1 or the cDNA sequences of SEQ ID NO: 3. The plasmid is linearized and
transcribed in the presence of ribonucleotides comprising modified ribonucleotides (i.e. biotin-UTP and
DIG-UTP). An excess of this doubly labeled RNA is hybridized in solution with mRNA isolated from
cells or tissues of interest. The hybridizations are performed under standard stringent conditions
(40-50°C for 16 hours in an 80% formamide, 0.4 M NaCl buffer, pH 7-8). The unhybridized probe is
removed by digestion with ribonucleases specific for single-stranded RNA (i.e. RNases CL3, T1, Phy M,
U2 or A). The presence of the biotin-UTP modification enables capture of the hybrid on a microtitration
plate coated with streptavidin. The presence of the DIG modification enables the hybrid to be detected
and quantified by ELISA using an anti-DIG antibody coupled to alkaline phosphatase.

Quantitative analysis of PG1 gene expression may also be performed using arrays as described
in Sections II and X. As used here, the term array means an arrangement of a plurality of nucleic acids
of sufficient length to permit specific detection of expression of PG1 mRNAs capable of hybridizing
thereto. For example, the arrays may contain a plurality of nucleic acids derived from genes whose
expression levels are to be assessed. The arrays may include the PG1 genomic DNA of SEQ ID
NO: 179, the PG1 cDNA of SEQ ID NO: 3 or the sequences complementary thereto or fragments thereof.
The array may contain some or all of the known alternative splice or transcription species of PG1,
including the species in SEQ ID NOs: 3 and 112-124, to determine the relative frequency of particular
transcription species. Alternatively, the array may contain polynucleotides which overlap all of the
potential splice junctions, including, for example, SEQ ID NOs: 137-178, so that the frequency of
particular splice junctions can be determined and correlated with traits or used in diagnostics just as
expression levels are. Preferably, the fragments are at least 15 nucleotides in length. In other
embodiments, the fragments are at least 25 nucleotides in length. In some embodiments, the fragments
are at least 50 nucleotides in length. More preferably, the fragments are at least 100 nucleotides in
length. In another preferred embodiment, the fragments are more than 100 nucleotides in length. In
some embodiments the fragments are more than 500 nucleotides in length.

For example, quantitative analysis of PG1 gene expression is performed with a complementary
DNA microarray as described by Schena et al. (Science 270:467-470, 1995; Proc. Natl. Acad. Sci.
U.S.A. 93:10614-10619, 1996). Full length PG1 cDNAs or fragments thereof are amplified by PCR and
arrayed from a 96-well microtiter plate onto silylated microscope slides using high-speed robotics.
Printed arrays are incubated in a humid chamber to allow rehydration of the array elements and rinsed,
once in 0.2% SDS for 1 min, twice in water for 1 min and once for 5 min in sodium borohydride
solution. The arrays are submerged in water for 2 min at 95°C, transferred into 0.2% SDS for 1 min,
rinsed twice with water, air dried and stored in the dark at 25°C.

Cell or tissue mRNA is isolated or commercially obtained and probes are prepared by a single
round of reverse transcription. Probes are hybridized to 1 cm² microarrays under a 14 x 14 mm glass
coverslip for 6-12 hours at 60°C. Arrays are washed for 5 min at 25°C in low stringency wash buffer
(1 x SSC/0.2% SDS), then for 10 min at room temperature in high stringency wash buffer (0.1 x
SSC/0.2% SDS). Arrays are scanned in 0.1 x SSC using a fluorescence laser scanning device fitted with
a custom filter set. Accurate differential expression measurements are obtained by taking the average of
the ratios of two independent hybridizations.
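
The averaging step can be sketched in Python as follows. The sketch assumes that each hybridization yields one signal for the sample of interest and one for a reference sample at a given array element, and that the "average" is the arithmetic mean of the two ratios. The function names and signal values are hypothetical.

```python
def expression_ratio(sample_signal, reference_signal):
    """Ratio of sample to reference signal at one array element."""
    if reference_signal <= 0:
        raise ValueError("reference signal must be positive")
    return sample_signal / reference_signal

def averaged_ratio(hybridizations):
    """Arithmetic mean of the ratios from independent hybridizations.

    hybridizations: iterable of (sample_signal, reference_signal) pairs.
    """
    ratios = [expression_ratio(s, r) for s, r in hybridizations]
    if len(ratios) < 2:
        raise ValueError("at least two independent hybridizations are expected")
    return sum(ratios) / len(ratios)

# Hypothetical fluorescence readings from two independent hybridizations.
mean_ratio = averaged_ratio([(1200.0, 400.0), (900.0, 450.0)])
```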

Quantitative analysis of PG1 gene expression may also be performed with full length PG1
cDNAs or fragments thereof in complementary DNA arrays as described by Pietu et al. (Genome
Research 6:492-503, 1996). The full length PG1 cDNA or fragments thereof is PCR amplified and
spotted on membranes. Then, mRNAs originating from various tissues or cells are labeled with
radioactive nucleotides. After hybridization and washing in controlled conditions, the hybridized
mRNAs are detected by phospho-imaging or autoradiography. Duplicate experiments are performed and
a quantitative analysis of differentially expressed mRNAs is then performed.

Alternatively, expression analysis using the PG1 genomic DNA, the PG1 cDNA, or fragments
thereof can be done through high density nucleotide arrays as described by Lockhart et al. (Nature
Biotechnology 14:1675-1680, 1996) and Sosnowski et al. (Proc. Natl. Acad. Sci. 94:1119-1123, 1997).
Oligonucleotides of 15-50 nucleotides from the sequences of the PG1 genomic DNA of SEQ ID NO:
179, the PG1 cDNA of SEQ ID NOs: 3, 112-124 or the sequences complementary thereto, are synthesized
directly on the chip (Lockhart et al., supra) or synthesized and then addressed to the chip (Sosnowski et
al., supra).
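
Selecting such oligonucleotides from a sequence and from its complement can be sketched in Python as follows. This is an illustrative assumption about how candidate oligonucleotides might be enumerated, not the design procedure of the cited references. The function names, the 25-nucleotide default and the example sequence are hypothetical.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Sequence complementary to seq, written 5' to 3'."""
    return seq.upper().translate(_COMPLEMENT)[::-1]

def tile_oligos(seq, length=25, step=1):
    """All oligonucleotides of the given length, taken every `step` bases."""
    if not 15 <= length <= 50:
        raise ValueError("oligonucleotides of 15-50 nucleotides are described")
    if step < 1:
        raise ValueError("step must be at least 1")
    seq = seq.upper()
    return [seq[i:i + length] for i in range(0, len(seq) - length + 1, step)]

# Hypothetical 30-nucleotide target; not taken from the Sequence Listing.
target = "ATGGCTAGCTAGGATCCGATTACAGGCTTA"
sense_oligos = tile_oligos(target, length=25)
antisense_oligos = tile_oligos(reverse_complement(target), length=25)
```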

PG1 cDNA probes labeled with an appropriate compound, such as biotin, digoxigenin or
fluorescent dye, are synthesized from the appropriate mRNA population and then randomly fragmented
to an average size of 50 to 100 nucleotides. The said probes are then hybridized to the chip. After
washing as described in Lockhart et al., supra and application of different electric fields (Sosnowski et
al., Proc. Natl. Acad. Sci. 94:1119-1123), the dyes or labeling compounds are detected and quantified.
Duplicate hybridizations are performed. Comparative analysis of the intensity of the signal originating
from cDNA probes on the same target oligonucleotide in different cDNA samples indicates a differential
expression of PG1 mRNA.

The above methods may also be used to determine whether an individual exhibits a PG1
expression pattern associated with cancer, prostate cancer or other diseases. In such methods, nucleic
acid samples from the individual are assayed for PG1 expression as described above. If a PG1
expression pattern associated with cancer, prostate cancer, or another disease is observed, an appropriate
diagnosis is rendered and appropriate therapeutic techniques which target the PG1 gene or protein may be
applied.

The above methods may also be applied using allele specific probes to determine whether an
individual possesses a PG1 allele associated with cancer, prostate cancer, or another disease. In such
approaches, one or more allele specific oligonucleotides containing polymorphic nucleotides in the PG1
gene which are associated with prostate cancer are fixed to a microarray. The array is contacted with a
nucleic acid sample from the individual being tested under conditions which permit allele specific
hybridization of the sample nucleic acid to the allele specific PG1 probes. Hybridization of the sample
nucleic acid to one or more of the allele specific PG1 probes indicates that the individual suffers from
prostate cancer caused by the PG1 gene or that the individual is at risk for developing prostate cancer at
a subsequent time. Alternatively, any of the genotyping methods described in Section X is utilized.

Use of the Biallelic Markers of the Invention in Diagnostics

The biallelic markers of the present invention can also be used to develop diagnostic tests
capable of identifying individuals who express a detectable trait as the result of a specific genotype or
individuals whose genotype places them at risk of developing a detectable trait at a subsequent time.

The diagnostic techniques of the present invention may employ a variety of methodologies to
determine whether a test subject has a biallelic marker pattern associated with an increased risk of
developing a detectable trait or whether the individual suffers from a detectable trait as a result of a
particular mutation, including methods which enable the analysis of individual chromosomes for
haplotyping, such as family studies, single sperm DNA analysis or somatic hybrids. The trait
analyzed using the present diagnostics may be any detectable trait, cancer, prostate cancer or another
disease, a response to an anti-cancer or anti-prostate cancer agent, or side effects to an anti-cancer or
anti-prostate cancer agent. Diagnostics, which analyze and predict response to a drug or side effects to a
drug, may be used to determine whether an individual should be treated with a particular drug. For
example, if the diagnostic indicates a likelihood that an individual will respond positively to treatment
with a particular drug, the drug is administered to the individual. Conversely, if the diagnostic
indicates that an individual is likely to respond negatively to treatment with a particular drug, an
alternative course of treatment is prescribed. A negative response is defined as either the absence of
an efficacious response or the presence of toxic side effects.

Clinical drug trials represent another application for the markers of the present invention.
One or more markers indicative of response to an anti-cancer or anti-prostate cancer agent or to side
effects to an anti-cancer or anti-prostate cancer agent may be identified using the methods described in
Section XI, below. Thereafter, potential participants in clinical trials of such an agent may be screened to
identify those individuals most likely to respond favorably to the drug and exclude those likely to
experience side effects. In that way, the effectiveness of drug treatment is measured in individuals
who respond positively to the drug, without lowering the measurement as a result of the inclusion of
individuals who are unlikely to respond positively in the study and without risking undesirable safety
problems. Preferably, in such diagnostic methods, a nucleic acid sample is obtained from the
individual and this sample is genotyped using methods described in Section X.

Another aspect of the present invention relates to a method of determining whether an
individual is at risk of developing a trait or whether an individual expresses a trait as a consequence of
possessing a particular trait-causing allele. The present invention relates to a method of determining
whether an individual is at risk of developing a plurality of traits or whether an individual expresses a
plurality of traits as a result of possessing a particular trait-causing allele. These methods involve
obtaining a nucleic acid sample from the individual and determining whether the nucleic acid sample
contains one or more alleles of one or more biallelic markers indicative of a risk of developing the
trait or indicative that the individual expresses the trait as a result of possessing a particular
trait-causing allele.

As described herein, the diagnostics may be based on a single biallelic marker or a group of
biallelic markers.

VI. ASSAYING THE PG1 PROTEIN FOR INVOLVEMENT IN RECEPTOR/LIGAND
INTERACTIONS

The expressed PG1 protein or portion thereof is evaluated for involvement in receptor/ligand
interactions as described in Example 16 below.

Example 16 

The proteins encoded by the PG1 gene or a portion thereof may also be evaluated for their
involvement in receptor/ligand interactions. Numerous assays for such involvement are familiar to those
skilled in the art, including the assays disclosed in the following references: Chapter 7.28 (Measurement
of Cellular Adhesion under Static Conditions 7.28.1-7.28.22) in Current Protocols in Immunology, J.E.
Coligan et al. Eds. Greene Publishing Associates and Wiley-Interscience; Takai et al., Proc. Natl. Acad.
Sci. USA 84:6864-6868, 1987; Bierer et al., J. Exp. Med. 168:1145-1156, 1988; Rosenstein et al., J. Exp.
Med. 169:149-160, 1989; Stoltenborg et al., J. Immunol. Methods 175:59-68, 1994; Stitt et al., Cell
80:661-670, 1995; Gyuris et al., Cell 75:791-803, 1993.

For example, the proteins of the present invention may demonstrate activity as receptors,
receptor ligands or inhibitors or agonists of receptor/ligand interactions. Examples of such receptors and
ligands include, without limitation, cytokine receptors and their ligands, receptor kinases and their
ligands, receptor phosphatases and their ligands, receptors involved in cell-cell interactions and their
ligands (including without limitation, cellular adhesion molecules (such as selectins, integrins and their
ligands) and receptor/ligand pairs involved in antigen presentation, antigen recognition and development
of cellular and humoral immune responses). Receptors and ligands are also useful for screening of
potential peptide or small molecule inhibitors of the relevant receptor/ligand interaction. A protein of
the present invention (including, without limitation, fragments of receptors and ligands) may itself
be useful as an inhibitor of receptor/ligand interactions.

The PG1 protein or portions thereof described above are used in drug screening procedures to
identify molecules which are agonists, antagonists, or inhibitors of PG1 activity. The PG1 protein or
portion thereof used in such analyses is free in solution or linked to a solid support. Alternatively, PG1
protein or portions thereof can be expressed on a cell surface. The cell may naturally express the PG1
protein or portion thereof or, alternatively, the cell may express the PG1 protein or portion thereof from
an expression vector such as those described below.

In one method of drug screening, eukaryotic or prokaryotic host cells which are stably transformed with
recombinant polynucleotides in order to express the PG1 protein or a portion thereof are used in
conventional competitive binding assays or standard direct binding assays. For example, the formation
of a complex between the PG1 protein or a portion thereof and the agent being tested is measured in
direct binding assays. Alternatively, the ability of a test agent to prevent formation of a complex
between the PG1 protein or a portion thereof and a known ligand is measured.
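
By way of illustration only, the readout of a competitive binding assay of the kind described above can be expressed as a percent inhibition of specific ligand binding. The following Python sketch is not part of the disclosed method; the function name, its parameters and the background-subtraction convention are assumptions introduced for this example.

```python
def percent_inhibition(signal_with_agent, signal_without_agent, background=0.0):
    """Percent reduction in specific ligand binding caused by a test agent.

    All three arguments are hypothetical assay readouts in the same
    arbitrary units (for example, counts of labeled ligand bound).
    """
    specific_max = signal_without_agent - background
    if specific_max <= 0:
        raise ValueError("uninhibited signal must exceed background")
    specific_observed = signal_with_agent - background
    return 100.0 * (1.0 - specific_observed / specific_max)


if __name__ == "__main__":
    # Illustrative numbers only: 1000 units bound without agent,
    # 400 units with agent, 100 units of non-specific background.
    print(round(percent_inhibition(400, 1000, 100), 1))
```

A test agent that prevents formation of the complex gives a value approaching 100, whereas an inactive agent gives a value near 0.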

Alternatively, the high throughput screening techniques disclosed in the published PCT
application WO 84/03564 are used. In such techniques, large numbers of small peptides to be tested for
PG1 binding activity are synthesized on a surface and affixed thereto. The test peptides are contacted
with the PG1 protein or a portion thereof, followed by a wash step. The amount of PG1 protein or
portion thereof which binds to the test compound is quantitated using conventional techniques.

In some methods, PG1 protein or a portion thereof is fixed to a surface and contacted with a test
compound. After a washing step, the amount of test compound which binds to the PG1 protein or
portion thereof is measured.

In another approach, the three dimensional structure of the PG1 protein or a portion thereof may
be determined and used for rational drug design.

Alternatively, the PG1 protein or a portion thereof is expressed in a host cell using expression
vectors such as those described herein. The PG1 protein or portion thereof is an isotype which is
associated with prostate cancer or an isotype which is not associated with prostate cancer. The cells
expressing the PG1 protein or portion thereof are contacted with a series of test agents and the effects of
the test agents on PG1 activity are measured. Test agents which modify PG1 activity are employed in
therapeutic treatments.

The above procedures may also be applied to evaluate mutant PG1 proteins responsible for a
detectable phenotype.

Identification of Proteins which Interact with the PG1 Protein

Proteins which interact with the PG1 protein are identified as described in Example 17, below.

Example 17 

Proteins which interact with the PG1 protein or a portion thereof are identified using two hybrid
systems such as the Matchmaker Two Hybrid System 2 (Catalog No. K1604-1, Clontech). As described
in the manual accompanying the Matchmaker Two Hybrid System 2 (Catalog No. K1604-1, Clontech),
nucleic acids encoding the PG1 protein or a portion thereof are inserted into an expression vector such
that they are in frame with DNA encoding the DNA binding domain of the yeast transcriptional activator
GAL4. cDNAs in a cDNA library which encode proteins which might interact with the polypeptides
encoded by the nucleic acids encoding the PG1 protein or a portion thereof are inserted into a second
expression vector such that they are in frame with DNA encoding the activation domain of GAL4. The
two expression plasmids are transformed into yeast and the yeast are plated on selection medium which
selects for expression of selectable markers on each of the expression vectors as well as GAL4
dependent expression of the HIS3 gene. Transformants capable of growing on medium lacking histidine
are screened for GAL4 dependent lacZ expression. Those cells which are positive in both the histidine
selection and the lacZ assay contain plasmids encoding proteins which interact with the polypeptide
encoded by the nucleic acid inserts.
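
The two-reporter selection described above amounts to retaining only those transformants that are positive in both the histidine selection and the lacZ assay. The following Python sketch illustrates that bookkeeping only. The record type, field names and clone identifiers are hypothetical and are not taken from the Matchmaker manual.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Transformant:
    clone_id: str                   # hypothetical library clone label
    grows_without_histidine: bool   # HIS3 reporter readout
    lacz_positive: bool             # lacZ reporter readout


def candidate_interactors(transformants):
    """Return the clones that score positive for both GAL4-dependent reporters."""
    return [
        t.clone_id
        for t in transformants
        if t.grows_without_histidine and t.lacz_positive
    ]


if __name__ == "__main__":
    screen = [
        Transformant("clone-01", True, True),
        Transformant("clone-02", True, False),   # HIS3 only: discarded
        Transformant("clone-03", False, False),
    ]
    print(candidate_interactors(screen))
```

Requiring both reporters, rather than either one, is what reduces the number of false positives carried forward for sequencing.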

Alternatively, the system described in Lustig et al., Methods in Enzymology 283: 83-99 (1997),
is used for identifying molecules which interact with the PG1 protein or a portion thereof. In such
systems, in vitro transcription reactions are performed on vectors containing an insert encoding the PG1
protein or a portion thereof cloned downstream of a promoter which drives in vitro transcription. The
resulting mRNA is introduced into Xenopus laevis oocytes. The oocytes are then assayed for a desired
activity.

Alternatively, the in vitro transcription products produced as described above are translated in
vitro. The in vitro translation products can be assayed for a desired activity or for interaction with a
known polypeptide.

The system described in U.S. Patent No. 5,654,150 may also be used to identify molecules
which interact with the PG1 protein or a portion thereof. In this system, pools of cDNAs are transcribed
and translated in vitro and the reaction products are assayed for interaction with a known polypeptide or
antibody.

Proteins or other molecules interacting with the PG1 protein or portions thereof can be found
by a variety of additional techniques. In one method, affinity columns containing the PG1 protein or a
portion thereof can be constructed. In some versions of this method the affinity column contains
chimeric proteins in which the PG1 protein or a portion thereof is fused to glutathione S-transferase.
A mixture of cellular proteins or pool of expressed proteins as described above is applied to the
affinity column. Proteins interacting with the polypeptide attached to the column can then be isolated
and analyzed on 2-D electrophoresis gel as described in Rasmussen et al., Electrophoresis, 18, 588-598
(1997). Alternatively, the proteins retained on the affinity column can be purified by electrophoresis
based methods and sequenced. The same method can be used to isolate antibodies, to screen phage
display products, or to screen phage display human antibodies.

Proteins interacting with the PG1 protein or portions thereof can also be screened by using an
Optical Biosensor as described in Edwards and Leatherbarrow, Analytical Biochemistry, 246, 1-6
(1997). The main advantage of the method is that it allows the determination of the association rate
between the protein and other interacting molecules. Thus, it is possible to specifically select
interacting molecules with a high or low association rate. Typically a target molecule is linked to the
sensor surface (through a carboxymethyl dextran matrix) and a sample of test molecules is placed in
contact with the target molecules. The binding of a test molecule to the target molecule causes a
change in the refractive index and/or thickness. This change is detected by the Biosensor provided it
occurs in the evanescent field (which extends a few hundred nanometers from the sensor surface). In
these screening assays, the target molecule can be the PG1 protein or a portion thereof and the test
sample can be a collection of proteins extracted from tissues or cells, a pool of expressed proteins,
combinatorial peptide and/or chemical libraries, or phage displayed peptides. The tissues or cells
from which the test proteins are extracted can originate from any species.
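
As an illustration of how an association rate might be extracted from biosensor data, the sketch below assumes a simple 1:1 binding model under pseudo-first-order conditions, in which the observed rate constant at each test molecule concentration C is k_obs = k_on * C + k_off. This model, the function name and the numerical values are assumptions made for the example. They are not taken from the cited reference or from the present disclosure.

```python
def fit_rate_constants(concentrations, observed_rates):
    """Least-squares line through (C, k_obs); slope is k_on, intercept is k_off.

    concentrations: molar concentrations of the test molecule
    observed_rates: observed rate constants (per second) at each concentration
    """
    n = len(concentrations)
    if n != len(observed_rates) or n < 2:
        raise ValueError("need at least two matched (C, k_obs) points")
    mean_c = sum(concentrations) / n
    mean_k = sum(observed_rates) / n
    sxx = sum((c - mean_c) ** 2 for c in concentrations)
    if sxx == 0:
        raise ValueError("concentrations must not all be identical")
    sxy = sum(
        (c - mean_c) * (k - mean_k)
        for c, k in zip(concentrations, observed_rates)
    )
    k_on = sxy / sxx                  # association rate constant, M^-1 s^-1
    k_off = mean_k - k_on * mean_c    # dissociation rate constant, s^-1
    return k_on, k_off


if __name__ == "__main__":
    # Synthetic, noise-free data generated from assumed constants.
    conc = [1e-8, 2e-8, 4e-8, 8e-8]
    rates = [1e5 * c + 1e-3 for c in conc]
    print(fit_rate_constants(conc, rates))
```

Test molecules can then be ranked by the fitted k_on in order to select those with a high or a low association rate.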

In other methods, a target protein is immobilized and the test population is the PG1 protein or
a portion thereof.

To study the interaction of the PG1 protein or a portion thereof with drugs, the microdialysis
coupled to HPLC method described by Wang et al., Chromatographia, 44, 205-208 (1997) or the
affinity capillary electrophoresis method described by Busch et al., J. Chromatogr. 777:311-328
(1997) can be used.

The above procedures may also be applied to evaluate mutant PG1 proteins responsible for a
detectable phenotype.

VII. PRODUCTION OF ANTIBODIES AGAINST PG1 POLYPEPTIDES

Any PG1 polypeptide or whole protein (SEQ ID NOs: 4, 5, 70, 74, 125-136), whether
human, mouse or mammalian, is used to generate antibodies capable of specifically binding to
expressed PG1 protein or fragments thereof as described in Example 18, below. The antibodies are
capable of binding the full length PG1 protein. PG1 proteins which result from naturally occurring
mutations, particularly functional mutants of PG1, including SEQ ID NO: 70, may be used in the
production of antibodies. The present invention also contemplates the use of polypeptides comprising
a contiguous stretch of amino acids, preferably at least 8 to 10 amino acids, more preferably
at least 12, 15, 20, 25, 50, or 100 amino acids of any PG1 protein in the manufacture of antibodies. In
a preferred embodiment the contiguous stretch of amino acids comprises the site of a mutation or
functional mutation, including a deletion, addition, swap or truncation of the amino acids in the PG1
protein sequence. For instance, polypeptides that contain either the Arg or His residue at amino
acid position 184, and polypeptides that contain either the Arg or Ile residue at amino acid position
293 of SEQ ID NO: 4 in said contiguous stretch are particularly preferred embodiments of the
invention and useful in the manufacture of antibodies to detect the presence and absence of these
mutations. Similarly, polypeptides with a carboxy terminus at position 228 are particularly preferred
embodiments of the invention and useful in the manufacture of antibodies to detect the presence and
absence of the mutation shown in SEQ ID NOs: 69 and 70. Similarly, polypeptides that contain
a peptide sequence of 8, 10, 12, 15, or 25 amino acids encoded over a naturally-occurring splice
junction (the point at which two human PG1 exons (SEQ ID NOs: 100-111) are covalently linked) in
said contiguous stretch are particularly preferred embodiments and useful in the manufacture of
antibodies to detect the presence, localization, and quantity of the various protein products of the PG1
alternative splice species.

Alternatively, the antibodies are screened so as to isolate those which are capable of binding an
epitope-containing fragment of at least 8, 10, 12, 15, 20, 25, or 30 amino acids of a human, mouse, or
mammalian PG1 protein, preferably a sequence selected from SEQ ID NOs: 4, 5, 70, 74, or 125-136.
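
To illustrate what is meant by a contiguous stretch of amino acids that comprises a given site, the following Python sketch enumerates every fragment of a chosen length that includes a chosen residue position. The sequence used is an arbitrary placeholder and not a PG1 sequence, and the function name and arguments are hypothetical.

```python
def windows_spanning(sequence, position, length):
    """All contiguous fragments of `length` residues that include `position`.

    sequence: one-letter amino acid string
    position: 1-based index of the residue that must be covered
    length:   fragment length in residues
    """
    if not 1 <= position <= len(sequence):
        raise ValueError("position lies outside the sequence")
    if not 1 <= length <= len(sequence):
        raise ValueError("length must be between 1 and the sequence length")
    first_start = max(0, position - length)                # 0-based start
    last_start = min(position - 1, len(sequence) - length)
    return [sequence[s:s + length] for s in range(first_start, last_start + 1)]


if __name__ == "__main__":
    placeholder = "ACDEFGHIKLMNPQRSTVWY"  # 20 residues, illustrative only
    for fragment in windows_spanning(placeholder, position=10, length=8):
        print(fragment)
```

Each fragment returned is a candidate immunogen covering the site of interest; in practice the choice among them would depend on factors such as solubility and surface exposure, which this sketch does not model.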

Antibodies may also be generated which are capable of specifically binding to a given isoform
of the PG1 protein. For example, the antibodies are capable of specifically binding to an isoform of the
PG1 protein which causes prostate cancer or another detectable phenotype which has been obtained as
described above and expressed from an expression vector as described above. Alternatively, the
antibodies are capable of binding to an isoform of the PG1 protein which does not cause prostate cancer.
Such antibodies are used in diagnostic assays in which protein samples from an individual are evaluated
for the presence of an isoform of the PG1 protein which causes cancer or another detectable phenotype
using techniques such as Western blotting or ELISA assays.

Non-human animals or mammals, whether wild-type or transgenic, which express a different
species of PG1 than the one to which antibody binding is desired, and animals which do not express
PG1 (i.e. a PG1 knock out animal as described in Section VIII) are particularly useful for preparing
antibodies. PG1 knock out animals will recognize all or most of the exposed regions of PG1 as
foreign antigens, and therefore produce antibodies to a wider array of PG1 epitopes. The humoral
immune system of animals which produce a species of PG1 that resembles the antigenic sequence will
preferentially recognize the differences between the animal's native PG1 species and the antigen
sequence, and produce antibodies to these unique sites in the antigen sequence.

Example 18

Substantially pure protein or polypeptide is isolated from transfected or transformed cells
containing an expression vector encoding the PG1 protein or a portion thereof as described in Example
11. The concentration of protein in the final preparation is adjusted, for example, by concentration on an
Amicon filter device, to the level of a few micrograms/ml. Monoclonal or polyclonal antibody to the
protein can then be prepared as follows:

A. Monoclonal Antibody Production by Hybridoma Fusion

Monoclonal antibody to epitopes in the PG1 protein or a portion thereof can be prepared from
murine hybridomas according to the classical method of Kohler, G. and Milstein, C., Nature 256:495
(1975) or derivative methods thereof. Also see Harlow, E., and D. Lane. 1988. Antibodies: A
Laboratory Manual, Cold Spring Harbor Laboratory, pp. 53-242.

Briefly, a mouse is repetitively inoculated with a few micrograms of the PG1 protein or a portion
thereof over a period of a few weeks. The mouse is then sacrificed, and the antibody producing cells of
the spleen isolated. The spleen cells are fused by means of polyethylene glycol with mouse myeloma
cells, and the excess unfused cells destroyed by growth of the system on selective media comprising
aminopterin (HAT media). The successfully fused cells are diluted and aliquots of the dilution placed in
wells of a microtiter plate where growth of the culture is continued. Antibody-producing clones are
identified by detection of antibody in the supernatant fluid of the wells by immunoassay procedures,
such as ELISA, as originally described by Engvall, E., Meth. Enzymol. 70:419 (1980), and derivative
methods thereof. Selected positive clones can be expanded and their monoclonal antibody product
harvested for use. Detailed procedures for monoclonal antibody production are described in Davis, L. et
al. Basic Methods in Molecular Biology, Elsevier, New York. Section 21-2.

B. Polyclonal Antibody Production by Immunization

Polyclonal antiserum containing antibodies to heterogeneous epitopes in the PG1 protein or a
portion thereof can be prepared by immunizing a suitable non-human animal with the PG1 protein or a
portion thereof, which can be unmodified or modified to enhance immunogenicity. A suitable
non-human animal, preferably a non-human mammal, is selected, usually a mouse, rat, rabbit, goat, or
horse. Alternatively, a crude preparation which has been enriched for PG1 concentration can be used
to generate antibodies. Such proteins, fragments or preparations are introduced into the non-human
mammal in the presence of an appropriate adjuvant (e.g. aluminum hydroxide, RIBI, etc.) which is
known in the art. In addition the protein, fragment or preparation can be pretreated with an agent
which will increase antigenicity; such agents are known in the art and include, for example,
methylated bovine serum albumin (mBSA), bovine serum albumin (BSA), Hepatitis B surface
antigen, and keyhole limpet hemocyanin (KLH). Serum from the immunized animal is collected,
treated and tested according to known procedures. If the serum contains polyclonal antibodies to
undesired epitopes, the polyclonal antibodies can be purified by immunoaffinity chromatography.

Effective polyclonal antibody production is affected by many factors related both to the
antigen and the host species. Also, host animals vary in response to site of inoculations and dose,
with both inadequate or excessive doses of antigen resulting in low titer antisera. Small doses (ng
level) of antigen administered at multiple intradermal sites appear to be most reliable. Techniques
for producing and processing polyclonal antisera are known in the art, see, for example, Mayer and
Walker (1987). An effective immunization protocol for rabbits can be found in Vaitukaitis, J. et al. J.
Clin. Endocrinol. Metab. 33:988-991 (1971).

Booster injections can be given at regular intervals, and antiserum harvested when antibody titer
thereof, as determined semi-quantitatively, for example, by double immunodiffusion in agar against
known concentrations of the antigen, begins to fall. See, for example, Ouchterlony, O. et al., Chap. 19
in: Handbook of Experimental Immunology, D. Weir (ed.), Blackwell (1973). Plateau concentration of
antibody is usually in the range of 0.1 to 0.2 mg/ml of serum (about 12 µM). Affinity of the antisera for
the antigen is determined by preparing competitive binding curves, as described, for example, by Fisher,
D., Chap. 42 in: Manual of Clinical Immunology, 2d Ed. (Rose and Friedman, Eds.) Amer. Soc. For
Microbiol., Washington, D.C. (1980).

Antibody preparations prepared according to either the monoclonal or the polyclonal protocol
are useful in quantitative immunoassays which determine concentrations of antigen-bearing substances
in biological samples; they are also used semi-quantitatively or qualitatively to identify the presence of
antigen in a biological sample. The antibodies may also be used in therapeutic compositions for killing
cells expressing the protein or reducing the levels of the protein in the body.

VIII. VECTORS AND THE USES OF POLYNUCLEOTIDES IN CELLS, ANIMALS, AND
HUMANS

The nucleic acids of the invention include expression vectors, amplification vectors, PCR-
suitable polynucleotide primers, and vectors which are suitable for the introduction of a
polynucleotide of the invention into embryonic stem cells for the production of transgenic
non-human animals. In addition, vectors which are suitable for the introduction of a polynucleotide of the
invention into cells, organs and individuals, including human individuals, for the purposes of gene
therapy to reduce the severity of or prevent genetic diseases associated with functional mutations in
PG1 genes are encompassed by the present invention. Functional mutations in PG1 genes which are
suitable as targets for the gene therapy and transgenic vectors and methods of the invention include,
but are not limited to, mutations in the coding region of the PG1 gene which affect the amino acid
sequence of the PG1 gene's product, mutations in the promoter or other regulatory regions which
affect the levels of PG1 expression, mutations in the PG1 splice sites which affect length of the PG1
gene product or the relative frequency of PG1 alternative splicing species, and any other mutation
which in any way affects the level or quality of PG1 expression or activity. The gene therapy
methods can be achieved by targeting vectors and methods for changing a mutant PG1 gene into a
wild-type PG1 gene in an embryonic stem cell or somatic cell. Alternatively, the present invention also
encompasses methods and vectors for introducing the expression of wild-type PG1 sequences without
the disruption of any mutant PG1 which already resides in the cell, organ or individual.

The invention also embodies amplification vectors, which comprise a polynucleotide of the
invention, and an origin of replication. Preferably, such amplification vectors further comprise
restriction endonuclease sites flanking the polynucleotide, so as to facilitate cleavage and purification
of the polynucleotides from the remainder of the amplification vector, and a selectable marker, so as
to facilitate amplification of the amplification vector. Most preferably, the restriction endonuclease
sites in the amplification vector are situated such that cleavage at those sites would result in no other
amplification vector fragments of a similar size.

Thus, such an amplification vector is transfected into a host cell compatible with the origin of
replication of said amplification vector, wherein the host cell is a prokaryotic or eukaryotic cell,
preferably a mammalian, insect, yeast, or bacterial cell, most preferably an Escherichia coli cell. The
resulting transfected host cells are grown by culture methods known in the art, preferably under
selection compatible with the selectable marker (e.g., antibiotics). The amplification vectors can be
isolated and purified by methods known in the art (e.g., standard plasmid prep procedures). The
polynucleotide of the invention can be cleaved with restriction enzymes that specifically cleave at the
restriction endonuclease sites flanking the polynucleotide, and the double-stranded polynucleotide
fragment purified by techniques known in the art, including gel electrophoresis.
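
The preference that cleavage at the flanking sites yield no other vector fragment of a similar size can be checked in silico before a vector is built. The Python sketch below is one possible way to do so for a circular vector. The function names, the cut coordinates and the 20% size-difference threshold are all assumptions chosen for illustration and do not derive from the disclosure.

```python
def circular_fragments(plasmid_length, cut_positions):
    """Fragment lengths produced by cutting a circular molecule at the given sites."""
    cuts = sorted(set(p % plasmid_length for p in cut_positions))
    if not cuts:
        return [plasmid_length]  # uncut circle
    fragments = []
    for i, cut in enumerate(cuts):
        next_cut = cuts[(i + 1) % len(cuts)]
        # A single cut linearizes the circle into one full-length fragment.
        fragments.append((next_cut - cut) % plasmid_length or plasmid_length)
    return fragments


def insert_is_resolvable(plasmid_length, cut_positions, insert_length,
                         min_relative_difference=0.2):
    """True if the insert fragment differs enough in size from all other fragments."""
    fragments = circular_fragments(plasmid_length, cut_positions)
    if insert_length not in fragments:
        return False
    fragments.remove(insert_length)
    return all(
        abs(f - insert_length) / insert_length >= min_relative_difference
        for f in fragments
    )


if __name__ == "__main__":
    # Hypothetical 5000 bp vector with a 1500 bp insert between two sites.
    print(insert_is_resolvable(5000, [1000, 2500], 1500))
    # A third site creates a 1600 bp backbone fragment close to the insert size.
    print(insert_is_resolvable(5000, [1000, 2500, 4100], 1500))
```

The threshold would in practice be set by the resolving power of the gel system used for purification.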

Alternatively, linear polynucleotides comprising a polynucleotide of the invention are amplified
by PCR. The PCR method is well known in the art and described in, e.g., U.S. Patent Nos. 4,683,195
and 4,683,202 and Saiki, R. et al. 1988. Science 239:487-491, and European patent applications
86302298.4, 86302299.2 and 87300203.4, as well as Methods in Enzymology 1987 155:335-350.

The polynucleotides of the invention can also be derivatized in various ways, including those
appropriate for facilitating transfection and/or gene therapy. The polynucleotides can be derivatized
by attaching a nuclear localization signal to them to improve targeted delivery to the nucleus. One
well-characterized nuclear localization signal is the heptapeptide PKKKRKV (pro-lys-lys-lys-arg-lys-val).
Preferably, in the case of polynucleotides in the form of a closed circle, the nuclear localization signal
is attached via a modified loop nucleotide or spacer that forms a branching structure.

If it is to be used in vivo, the polynucleotide of the invention is derivatized to include ligands
and/or delivery vehicles which provide dispersion through the blood, targeting to specific cell types,
or permit easier transit of cellular barriers. Thus, the polynucleotides of the invention are linked or
combined with any targeting or delivery agent known in the art, including but not limited to, cell
penetration enhancers, lipofectin, liposomes, dendrimers, DNA intercalators, and nanoparticles. In
particular, nanoparticles for use in the delivery of the polynucleotides of the invention are particles of
less than about 50 nanometers diameter, nontoxic, non-antigenic, and comprised of albumin and
surfactant, or iron as in the nanoparticle technology of SynGenix. In general the delivery
vehicles used to target the polynucleotides of the invention may further comprise any cell specific or
general targeting agents known in the art, and will have a specific trapping efficiency to the target
cells or organs of from about 5 to about 35%.

The polynucleotides of the invention are used ex vivo in a gene therapy method for obtaining
cells or organs which produce wild-type PG1 or PG1 proteins which have been selectively mutated.
The cells are created by incubation of the target cell with one or more of the above-described
polynucleotides under standard conditions for uptake of nucleic acids, including electroporation or
lipofection. In practicing an ex vivo method of treating cells or organs, the concentration of
polynucleotides of the invention in a solution prepared to treat target cells or organs is from about 0.1
to about 100 µM, preferably 0.5 to 50 µM, most preferably from 1 to 10 µM.

Alternatively, the oligonucleotides can be modified or co-administered for targeted delivery to
the nucleus. Improved oligonucleotide stability is expected in the nucleus due to: (1) lower levels of
DNases and RNases; and (2) higher oligonucleotide concentrations due to lower total volume.

Alternatively, the polynucleotides of the invention can be covalently bonded to biotin to form
a biotin-polynucleotide prodrug by methods known in the art, and co-administered with a receptor
ligand bound to avidin or receptor specific antibody bound to avidin, wherein the receptor is capable
of causing uptake of the resulting polynucleotide-biotin-avidin complex into the cells. Receptors that
cause uptake are known to those of skill in the art.

The invention encompasses vectors which are suitable for the introduction of a polynucleotide
of the invention into an embryonic stem cell for the production of transgenic non-human animals,
which in turn results in the expression of recombinant PG1 in the transgenic animal. Any appropriate
vector system can be used for the introduction and expression of PG1 in transgenic animals, including
for example yeast artificial chromosomes (YAC), bacterial artificial chromosomes (BAC),
bacteriophage P1, and other vectors known in the art which are able to accommodate sufficiently
large inserts to encode the PG1 protein or desired fragments thereof. Selected alterations, additions
and deletions in the PG1 gene may optionally be achieved by site-directed mutagenesis. Once an
appropriate vector system is chosen, the site-directed mutagenesis process may then be conducted by
techniques well known in the art, and the fragment returned and ligated to the larger vector from
which it was cleaved. For site directed mutagenesis methods see, for example, Kunkel, T. 1985. Proc.
Natl. Acad. Sci. U.S.A. 82:488; Vandeyar, M. et al. 1988. Gene 65: 129-133; Nelson, M., and M.
McClelland 1992. Methods Enzymol. 216:279-303; Weiner, M. 1994. Gene 151: 119-123; Costa, G.
and M. Weiner. 1994. Nucleic Acids Res. 22: 2423; Hu, G. 1993. DNA and Cell Biology 12:763-770;
and Deng, W. and J. Nickoloff. 1992. Anal. Biochem. 200:81.

Briefly, the transgenic technology used herein involves the inactivation, addition or
replacement of a portion of the PG1 gene or the entire gene. For example the present technology
includes the addition of PG1 genes with or without the inactivation of the non-human animal's native
PG1 genes, as described in the preceding two paragraphs and in the Examples. The invention also
encompasses the use of vectors, and the vectors themselves which target and modify an existing
human PG1 gene in a stem cell, whether it is contained in a non-human animal cell where it was
previously introduced into the germ line by transgenic technology or it is a native PG1 gene in a
human pluripotent or somatic cell. This transgene technology usually relies on homologous
recombination in a pluripotent cell that is capable of differentiating into germ cell tissue. A DNA
construct that encodes an altered region of the non-human animal's PG1 gene that contains, for
instance, a stop codon to destroy expression, is introduced into the nuclei of embryonic stem cells.
Preferably mice are used for this transgenic work. In a portion of the cells, the introduced DNA
recombines with the endogenous copy of the cell's gene, replacing it with the altered copy. Cells
containing the newly engineered genetic alteration are injected in a host embryo of the same species
as the stem cell, and the embryo is reimplanted into a recipient female. Some of these embryos
develop into chimeric individuals that possess germ cells entirely derived from the mutant cell line.
Therefore, by breeding the chimeric progeny it is possible to obtain a new strain containing the
introduced genetic alteration. See Capecchi 1989. Science 244:1288-1292 for a review of this
procedure.

The present invention encompasses the polynucleotides described herein, as well as the
methods for making these polynucleotides including the method for creating a mutation in a human
PG1 gene. In addition, the present invention encompasses cells which comprise the polynucleotides
of the invention, including but not limited to amplification host cells comprising amplification vectors
of the invention. Furthermore the present invention comprises the embryonic stem cells and
transgenic non-human animals and mammals described herein which comprise a gene encoding a
human PG1 protein.

DNA construct that enables directing temporal and spatial gene expression in recombinant host cells
and in transgenic animals

In order to study the physiological and phenotypic consequences of a lack of synthesis of the
PG1 protein, both at the cellular level and at the multi-cellular organism level, in particular as regards
disorders related to abnormal cell proliferation, notably cancers, the invention also encompasses
DNA constructs and recombinant vectors enabling a conditional expression of a specific allele of the
PG1 genomic sequence or cDNA and also of a copy of this genomic sequence or cDNA harboring
substitutions, deletions, or additions of one or more bases as regards the PG1 nucleotide sequence
of SEQ ID NOs: 3, 112-125, 179, 182-184, or a fragment thereof, these base substitutions, deletions
or additions being located either in an exon, an intron or a regulatory sequence, but preferably in a
5'-regulatory sequence of a mammalian PG1 gene, more preferably SEQ ID NO: 180 or in an exon of the
PG1 genomic sequence or within the PG1 cDNA of SEQ ID NOs: 3, 112-125, or 184.

A first preferred DNA construct is based on the tetracycline resistance operon tet from the E. coli
transposon Tn10 for controlling PG1 gene expression, such as described by Gossen M. et al.,
1992, Proc. Natl. Acad. Sci. USA, 89: 5547-5551; Gossen M. et al., 1995, Science, 268: 1766-1769;
and Furth P.A. et al., 1994, Proc. Natl. Acad. Sci. USA, 91: 9302-9306. Such a DNA construct
contains seven tet operator sequences from Tn10 (tetop) that are fused to either a minimal promoter or
a 5'-regulatory sequence of the PG1 gene, said minimal promoter or said PG1 regulatory sequence
being operably linked to a polynucleotide of interest that codes either for a sense or an antisense
oligonucleotide or for a polypeptide, including a PG1 polypeptide or a peptide fragment thereof. This
DNA construct is functional as a conditional expression system for the nucleotide sequence of interest
when the same cell also comprises a nucleotide sequence coding for either the wild type (tTA) or the
mutant (rTA) repressor fused to the activating domain of viral protein VP16 of herpes simplex virus,
placed under the control of a promoter, such as the HCMVIE1 enhancer/promoter or the MMTV-LTR.
Indeed, a preferred DNA construct of the invention will comprise both the polynucleotide
containing the tet operator sequences and the polynucleotide containing a sequence coding for the
tTA or the rTA repressor.

In the specific embodiment wherein the conditional expression DNA construct contains the
sequence encoding the mutant tetracycline repressor rTA, the expression of the polynucleotide of
interest is silent in the absence of tetracycline and induced in its presence.
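
The conditional behaviour described above can be summarized as a small truth table. The following is a minimal sketch only: the function name `expression_on` is hypothetical, and the behaviour given for the wild type tTA fusion (expression in the absence of tetracycline) is an assumption taken from the general understanding of the cited Gossen et al. system rather than from the text above, which states only the rTA case.

```python
def expression_on(transactivator: str, tetracycline_present: bool) -> bool:
    """Return True if the tetop-linked polynucleotide is expected to be expressed.

    'rTA' (mutant repressor fusion): silent without tetracycline, induced with it
    (as stated in the text).
    'tTA' (wild type repressor fusion): assumed here to behave the opposite way.
    """
    if transactivator == "rTA":
        return tetracycline_present
    if transactivator == "tTA":
        return not tetracycline_present
    raise ValueError(f"unknown transactivator: {transactivator!r}")


if __name__ == "__main__":
    for name in ("tTA", "rTA"):
        for tet in (False, True):
            condition = "with tetracycline" if tet else "without tetracycline"
            print(name, condition, "->", "expressed" if expression_on(name, tet) else "silent")
```

The sketch treats induction as all-or-nothing; in practice the response would be expected to depend on dose and on the promoter driving the transactivator.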

DNA constructs allowing homologous recombination: replacement vectors

A second preferred DNA construct will comprise, from 5'-end to 3'-end: (a) a first nucleotide
sequence that is comprised of a PG1 sequence, preferably a PG1 genomic sequence; (b) a nucleotide
sequence comprising a positive selection marker, such as the marker for neomycin resistance (neo);
and (c) a second nucleotide sequence that is comprised of a PG1 sequence, preferably a PG1 genomic
sequence, and is located on the genome downstream of the first PG1 nucleotide sequence (a).

In a preferred embodiment, this DNA construct also comprises a negative selection marker
located upstream of the nucleotide sequence (a) or downstream of the nucleotide sequence (b). Preferably,
the negative selection marker consists of the thymidine kinase (tk) gene (Thomas K.R. et al., 1986,
Cell, 44: 419-428), the hygromycin beta gene (Te Riele et al., 1990, Nature, 348: 649-651), the hprt
gene (Van der Lugt et al., 1991, Gene, 105: 263-267; and Reid L.H. et al., 1990, Proc. Natl. Acad. Sci.
USA, 87: 4299-4303) or the Diphtheria toxin A fragment (Dt-A) gene (Nada S. et al., 1993, Cell, 73:
1125-1135; Yagi T. et al., 1990, Proc. Natl. Acad. Sci. USA, 87: 9918-9922). Preferably, the positive
selection marker is located within a PG1 exon sequence so as to interrupt the sequence encoding a
PG1 protein.

These replacement vectors are described for example by Thomas K.R. et al., 1986, Cell, 44:
419-428; Thomas K.R. et al., 1987, Cell, 51: 503-512; Mansour S.L. et al., 1988, Nature, 336: 348-352;
and Koller et al., 1992, Annu. Rev. Immunol., 10: 705-30.

The first and second nucleotide sequences (a) and (c) are located at any point within a PG1
regulatory sequence, an intronic sequence, an exon sequence or a sequence containing both regulatory
and/or intronic and/or exon sequences. The length of nucleotide sequences (a) and (c) is determined
empirically by one of ordinary skill in the art. Nucleotide sequences (a) and (c) of any length are
specifically contemplated in the present invention; however, lengths ranging from 1 kb to 50 kb,
preferably from 1 kb to 10 kb, more preferably from 2 kb to 6 kb and most preferably from 2 kb to 4
kb are normally used.
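
The nested length ranges stated above for the homology arms (a) and (c) can be expressed as a simple lookup. This is an illustrative sketch only; the names `PREFERRED_RANGES_KB` and `arm_length_tier` are hypothetical, and treating the range boundaries as inclusive is an assumption.

```python
# Ranges for homology arms (a) and (c) in kb, listed from broadest to narrowest,
# as stated in the text. Boundaries are treated as inclusive (an assumption).
PREFERRED_RANGES_KB = [
    ("normally used", 1.0, 50.0),
    ("preferred", 1.0, 10.0),
    ("more preferred", 2.0, 6.0),
    ("most preferred", 2.0, 4.0),
]


def arm_length_tier(length_kb: float) -> str:
    """Return the narrowest stated range that contains length_kb."""
    tier = "outside the stated ranges (still contemplated)"
    for label, low, high in PREFERRED_RANGES_KB:
        if low <= length_kb <= high:
            tier = label  # later entries are narrower, so the last match wins
    return tier


if __name__ == "__main__":
    for kb in (0.5, 3.0, 4.3, 8.0, 20.0):
        print(f"{kb} kb: {arm_length_tier(kb)}")
```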

DNA constructs allowing homologous recombination: Cre-loxP system

These new DNA constructs make use of the site-specific recombination system of the P1
phage. The P1 phage possesses a recombinase called Cre which interacts specifically with a 34 base
pair loxP site. The loxP site is composed of two palindromic sequences of 13 bp separated by an 8 bp
conserved sequence (Hoess et al., 1986, Nucleic Acids Res., 14: 2287-2300). The recombination by
the Cre enzyme between two loxP sites having an identical orientation leads to the deletion of the
intervening DNA fragment.

The Cre-loxP system used in combination with a homologous recombination technique was
first described by Gu H. et al., 1993, Cell, 73: 1155-1164; and Gu H. et al., 1994, Science, 265:
103-106. Briefly, a nucleotide sequence of interest to be inserted in a targeted location of the genome
harbors at least two loxP sites in the same orientation and located at the respective ends of a
nucleotide sequence to be excised from the recombinant genome. The excision event requires the
presence of the recombinase (Cre) enzyme within the nucleus of the recombinant host cell. The
recombinase enzyme is brought at the desired time either by (a) incubating the recombinant host cells
in a culture medium containing this enzyme, by injecting the Cre enzyme directly into the desired cell,
such as described by Araki K. et al., 1995, Proc. Natl. Acad. Sci. USA, 92: 160-164; or by lipofection
of the enzyme into the cells, such as described by Baubonis et al., 1993, Nucleic Acids Res., 21:
2025-2029; (b) transfecting the cell host with a vector comprising the Cre coding sequence operably linked
to a promoter functional in the recombinant cell host, which promoter is optionally inducible, said
vector being introduced in the recombinant cell host, such as described by Gu H. et al., 1993, Cell, 73:
1155-1164; and Sauer B. et al., 1988, Proc. Natl. Acad. Sci. USA, 85: 5166-5170; or (c) introducing in
the genome of the host cell a polynucleotide comprising the Cre coding sequence operably linked to a
promoter functional in the recombinant cell host, which promoter is optionally inducible, said
polynucleotide being inserted in the genome of the cell host either by a random insertion event or a
homologous recombination event, such as described by Gu H. et al., 1994, Science, 265: 103-106.
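
The loxP site structure (13 bp, 8 bp, 13 bp) and the deletion produced by Cre between two sites in the same orientation can be illustrated with a short string-level sketch. The sequence used for `LOXP` is the commonly cited wild-type loxP sequence and is not given in this text; the flanking and intervening sequences in the usage example are arbitrary placeholders, and `cre_excise` is a hypothetical helper that models only the net outcome, not the enzymatic mechanism.

```python
# Commonly cited wild-type loxP sequence (assumption: not stated in the text).
LOXP = "ATAACTTCGTATAGCATACATTATACGAAGTTAT"


def reverse_complement(seq: str) -> str:
    """Reverse complement of an upper-case DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def loxp_parts(site: str = LOXP):
    """Split a 34 bp loxP site into 13 bp arm, 8 bp spacer, 13 bp arm."""
    return site[:13], site[13:21], site[21:]


def cre_excise(genome: str, site: str = LOXP) -> str:
    """Delete the segment between the first two same-orientation loxP sites.

    One loxP site is left behind. The genome is returned unchanged when
    fewer than two sites are found.
    """
    first = genome.find(site)
    if first == -1:
        return genome
    second = genome.find(site, first + len(site))
    if second == -1:
        return genome
    return genome[:first] + genome[second:]


if __name__ == "__main__":
    left, spacer, right = loxp_parts()
    print(len(LOXP), len(left), len(spacer), len(right))
    print("arms are inverted repeats:", right == reverse_complement(left))
    floxed = "GGGG" + LOXP + "AAAACCCC" + LOXP + "TTTT"  # placeholder sequences
    print(cre_excise(floxed))
```

The two 13 bp arms are inverted repeats of each other, and the asymmetric 8 bp spacer is what gives the site an orientation.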

In the specific embodiment wherein the vector containing the sequence to be inserted in the
PG1 gene by homologous recombination is constructed in such a way that selectable markers are
flanked by loxP sites of the same orientation, it is possible, by treatment with the Cre enzyme, to
eliminate the selectable markers while leaving the PG1 sequences of interest that have been inserted
by a homologous recombination event. Again, two selectable markers are needed: a positive
selection marker to select for the recombination event and a negative selection marker to select for the
homologous recombination event. Vectors and methods using the Cre-loxP system are described by
Zou Y.R. et al., 1994, Curr. Biol., 4: 1099-1103.

Thus, a third preferred DNA construct of the invention comprises, from 5'-end to 3'-end: (a) a
first nucleotide sequence that is comprised of a PG1 sequence, preferably a PG1 genomic sequence;
(b) a nucleotide sequence comprising a polynucleotide encoding a positive selection marker, such as
the marker for neomycin resistance (neo), said nucleotide sequence comprising additionally two
sequences defining a site recognized by a recombinase, such as a loxP site, the two sites being placed
in the same orientation; and (c) a second nucleotide sequence that is comprised of a PG1 sequence,
preferably a PG1 genomic sequence, and is located on the genome downstream of the first PG1
nucleotide sequence (a).

The sequences defining a site recognized by a recombinase, such as a loxP site, are preferably
located within the nucleotide sequence (b) at suitable locations bordering the nucleotide sequence for
which the conditional excision is sought. In one specific embodiment, two loxP sites are located at
each side of the positive selection marker sequence, in order to allow its excision at a desired time
after the occurrence of the homologous recombination event.

In a preferred embodiment of a method using the third DNA construct described above, the
excision of the polynucleotide fragment bordered by the two sites recognized by a recombinase,
preferably two loxP sites, is performed at a desired time, due to the presence within the genome of the
recombinant host cell of a sequence encoding the Cre enzyme operably linked to a promoter sequence,
preferably an inducible promoter, more preferably a tissue-specific promoter sequence and most
preferably a promoter sequence which is both inducible and tissue-specific, such as described by Gu
H. et al., 1994, Science, 265: 103-106.

The presence of the Cre enzyme within the genome of the recombinant cell host may result from
the breeding of two transgenic animals, the first transgenic animal bearing the PG1-derived sequence
of interest containing the loxP sites as described above and the second transgenic animal bearing the
Cre coding sequence operably linked to a suitable promoter sequence, such as described by Gu H. et
al., 1994, Science, 265: 103-106. Spatio-temporal control of the Cre enzyme expression may also be
achieved with an adenovirus-based vector that contains the Cre gene, thus allowing infection of cells,
or in vivo infection of organs, for delivery of the Cre enzyme, such as described by Anton M. et al.,
1995, J. Virol., 69: 4600-4606; and Kanegae Y. et al., 1995, Nucl. Acids Res., 23: 3816-3821.

The DNA constructs described above are used to introduce a desired nucleotide sequence of
the invention, preferably a PG1 genomic sequence or a PG1 cDNA sequence, and most preferably an
altered copy of a PG1 genomic or cDNA sequence, within a predetermined location of the targeted
genome, leading either to the generation of an altered copy of a targeted gene (knock-out homologous
recombination) or to the replacement of a copy of the targeted gene by another copy sufficiently
homologous to allow a homologous recombination event to occur (knock-in homologous
recombination).

Nuclear antisense DNA constructs 
Preferably, the antisense polynucleotides of the invention have a 3' polyadenylation signal
that has been replaced with a self-cleaving ribozyme sequence, such that RNA polymerase II
transcripts are produced without poly(A) at their 3' ends, these antisense polynucleotides being
incapable of export from the nucleus, such as described by Liu Z. et al., 1994, Proc. Natl. Acad. Sci.
USA, 91: 4528-4262. In a preferred embodiment, these PG1 antisense polynucleotides also comprise,
within the ribozyme cassette, a histone stem-loop structure to stabilize cleaved transcripts against
3'-5' exonucleolytic degradation, such as described by Eckner R. et al., 1991, EMBO J., 10: 3513-3522.

Expression Vectors 

The polynucleotides of the invention also include expression vectors. Expression vector
systems, control sequences and compatible hosts are known in the art. For a review of these systems
see, for example, U.S. Patent No. 5,350,671, columns 45-48. Any of the standard methods known to
those skilled in the art for the insertion of DNA fragments into a vector can be used to construct expression
vectors containing a chimeric gene consisting of appropriate transcriptional/translational control
signals and the protein coding sequences. These methods may include in vitro recombinant DNA and
synthetic techniques and in vivo recombination (genetic recombination).

Expression of a polypeptide, peptide or derivative, or analogs thereof encoded by a
polynucleotide sequence in SEQ ID NOs: 3, 69, 100-112, or 179-184 is regulated by a second nucleic
acid sequence so that the protein or peptide is expressed in a host transformed with the recombinant
DNA molecule. For example, expression of a protein or peptide is controlled by any
promoter/enhancer element known in the art. Promoters which can be used to control expression include,
but are not limited to, the CMV promoter, the SV40 early promoter region (Bernoist and Chambon,
1981, Nature 220:304-310), the promoter contained in the 3' long terminal repeat of Rous sarcoma
virus (Yamamoto, et al., 1980, Cell 22:787-797), the herpes thymidine kinase promoter (Wagner et
al., 1981, Proc. Natl. Acad. Sci. U.S.A. 78:1441-1445), the regulatory sequences of the
metallothionein gene (Brinster et al., 1982, Nature 296:39-42); prokaryotic expression vectors such as
the beta-lactamase promoter (Villa-Kamaroff, et al., 1978, Proc. Natl. Acad. Sci. U.S.A. 75:3727-3731),
or the tac promoter (DeBoer, et al., 1983, Proc. Natl. Acad. Sci. U.S.A. 80:21-25); see also
"Useful proteins from recombinant bacteria" in Scientific American, 1980, 242:74-94; plant
expression vectors comprising the nopaline synthetase promoter region (Herrera-Estrella et al., 1983,
Nature 303:209-213) or the cauliflower mosaic virus 35S RNA promoter (Gardner, et al., 1981, Nucl.
Acids Res. 9:2871), and the promoter of the photosynthetic enzyme ribulose biphosphate carboxylase
(Herrera-Estrella et al., 1984, Nature 310:115-120); promoter elements from yeast or other fungi such
as the Gal 4 promoter, the ADC (alcohol dehydrogenase) promoter, PGK (phosphoglycerol kinase)
promoter, alkaline phosphatase promoter, and the following animal transcriptional control regions,
which exhibit tissue specificity and have been utilized in transgenic animals: elastase I gene control
region which is active in pancreatic acinar cells (Swift et al., 1984, Cell 38:639-646; Ornitz et al.,
1986, Cold Spring Harbor Symp. Quant. Biol. 50:399-409; MacDonald, 1987, Hepatology 2:425-515);
insulin gene control region which is active in pancreatic beta cells (Hanahan, 1985, Nature
315:115-122), immunoglobulin gene control region which is active in lymphoid cells (Grosschedl et
al., 1984, Cell 38:647-658; Adames et al., 1985, Nature 318:533-538; Alexander et al., 1987, Mol.
Cell. Biol. 7:1436-1444), mouse mammary tumor virus control region which is active in testicular,
breast, lymphoid and mast cells (Leder et al., 1986, Cell 45:485-495), albumin gene control region
which is active in liver (Pinkert et al., 1987, Genes and Devel. 1:268-276), alpha-fetoprotein gene
control region which is active in liver (Krumlauf et al., 1985, Mol. Cell. Biol. 5:1639-1648; Hammer
et al., 1987, Science 235:53-58); alpha 1-antitrypsin gene control region which is active in the liver
(Kelsey et al., 1987, Genes and Devel. 1:161-171), beta-globin gene control region which is active in
myeloid cells (Mogram et al., 1985, Nature 315:338-340; Kollias et al., 1986, Cell 46:89-94); myelin
basic protein gene control region which is active in oligodendrocyte cells in the brain (Readhead et
al., 1987, Cell 48:703-712); myosin light chain-2 gene control region which is active in skeletal
muscle (Sani, 1985, Nature 314:283-286), and gonadotropic releasing hormone gene control region
which is active in the hypothalamus (Mason et al., 1986, Science 234:1372-1378).

Other suitable vectors, particularly for the expression of genes in mammalian cells, can be selected
from the group of vectors consisting of P1 bacteriophages and bacterial artificial chromosomes
(BACs). These types of vectors may contain large inserts ranging from about 80-90 kb (P1
bacteriophage) to about 300 kb (BACs).

P1 bacteriophage

The construction of P1 bacteriophage vectors such as p158 or p158/neo8 is notably
described by Sternberg N.L., 1992, Trends Genet., 8: 1-16; and Sternberg N.L., 1994, Mamm.
Genome, 5: 397-404. Recombinant P1 clones comprising PG1 nucleotide sequences can be designed for
inserting large polynucleotides of more than 40 kb (Linton M.F. et al., 1993, J. Clin. Invest., 92:
3029-3037). To generate P1 DNA for transgenic experiments, a preferred protocol is the protocol described
by McCormick et al., 1994, Genet. Anal. Tech. Appl., 11: 158-164. Briefly, E. coli (preferably strain
NS3529) harboring the P1 plasmid are grown overnight in a suitable broth medium containing 25
µg/ml of kanamycin. The P1 DNA is prepared from the E. coli by alkaline lysis using the Qiagen
Plasmid Maxi kit (Qiagen, Chatsworth, CA, USA), according to the manufacturer's instructions. The
P1 DNA is purified from the bacterial lysate on two Qiagen-tip 500 columns, using the washing and
elution buffers contained in the kit. A phenol/chloroform extraction is then performed before
precipitating the DNA with 70% ethanol. After solubilizing the DNA in TE (10 mM Tris-HCl, pH 7.4,
1 mM EDTA), the concentration of the DNA is assessed by spectrophotometry.
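
The spectrophotometric estimate mentioned above is usually a one-line calculation. The sketch below assumes the widely used convention that an absorbance of 1.0 at 260 nm (1 cm path length) corresponds to roughly 50 µg/ml of double-stranded DNA; this conversion factor, the function name and the example readings are assumptions for illustration and are not taken from the protocol cited in the text.

```python
def dsdna_concentration_ug_per_ml(a260: float, dilution_factor: float = 1.0) -> float:
    """Estimate dsDNA concentration from absorbance at 260 nm.

    Assumes a 1 cm path length and the common rule of thumb that
    A260 = 1.0 corresponds to about 50 µg/ml of double-stranded DNA.
    """
    return a260 * 50.0 * dilution_factor


if __name__ == "__main__":
    # Hypothetical reading: A260 of 0.2 measured on a 1:50 dilution in TE.
    print(dsdna_concentration_ug_per_ml(0.2, dilution_factor=50), "µg/ml")
```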

When the goal is to express a P1 clone comprising PG1 nucleotide sequences in a transgenic
animal, typically in transgenic mice, it is desirable to remove vector sequences from the P1 DNA
fragment, for example by cleaving the P1 DNA at rare-cutting sites within the P1 polylinker (SfiI,
NotI or SalI). The P1 insert is then purified from vector sequences on a pulsed-field agarose gel,
using methods similar to those originally reported for the isolation of DNA from
YACs (Schedl A. et al., 1993, Nature, 362: 258-261; and Peterson et al., 1993, Proc. Natl. Acad. Sci.
USA, 90: 7593-7597). At this stage, the resulting purified insert DNA can be concentrated, if
necessary, on a Millipore Ultrafree-MC Filter Unit (Millipore, Bedford, MA, USA - 30,000
molecular weight limit) and then dialyzed against microinjection buffer (10 mM Tris-HCl, pH 7.4;
250 µM EDTA) containing 100 mM NaCl, 30 µM spermine, 70 µM spermidine on a microdialysis
membrane (type VS, 0.025 µm from Millipore). The intactness of the purified P1 DNA insert is
assessed by electrophoresis on 1% agarose (Sea Kem GTG; FMC Bio-products) pulsed-field gel and
staining with ethidium bromide.

Bacterial Artificial Chromosomes (BACs)
The bacterial artificial chromosome (BAC) cloning system (Shizuya et al., 1992, Proc. Natl.
Acad. Sci. USA, 89: 8794-8797) has been developed to stably maintain large fragments of genomic
DNA (100-300 kb) in E. coli. A preferred BAC vector consists of the pBeloBAC11 vector that has been
described by Kim U.J. et al., 1996, Genomics, 34: 213-218. BAC libraries are prepared with this vector
using size-selected genomic DNA that has been partially digested using enzymes that permit ligation
into either the Bam HI or Hind III sites in the vector. Flanking these cloning sites are T7 and SP6
RNA polymerase transcription initiation sites that can be used to generate end probes by either RNA
transcription or PCR methods. After the construction of a BAC library in E. coli, BAC DNA is
purified from the host cell as a supercoiled circle. Converting these circular molecules into a linear
form precedes both size determination and introduction of the BACs into recipient cells. The cloning
site is flanked by two Not I sites, permitting cloned segments to be excised from the vector by Not I
digestion. Alternatively, the DNA insert contained in the pBeloBAC11 vector can be linearized by
treatment of the BAC vector with the commercially available enzyme lambda terminase that leads to
the cleavage at the unique cosN site, but this cleavage method results in a full length BAC clone
containing both the insert DNA and the BAC sequences.

Host Cells 

PG1 gene expression in human cells can be rendered defective; alternatively, a PG1 genomic or
cDNA sequence can be inserted so as to replace the PG1 gene
counterpart in the genome of an animal cell with a PG1 polynucleotide according to the invention.
These genetic alterations are generated by homologous recombination events using the specific DNA
constructs that have been previously described.

One kind of host cell that can be used is the mammalian zygote, such as a murine zygote. For example,
murine zygotes may undergo microinjection with a purified DNA molecule of interest, for example a
purified DNA molecule that has previously been adjusted to a concentration range from 1 ng/ml (for
BAC inserts) to 3 ng/µl (for P1 bacteriophage inserts) in 10 mM Tris-HCl, pH 7.4, 250 µM EDTA
containing 100 mM NaCl, 30 µM spermine, and 70 µM spermidine. When the DNA to be
microinjected has a large size, polyamines and high salt concentrations can be used in order to avoid
mechanical breakage of this DNA, as described by Schedl et al., 1993, Nucleic Acids Res., 21: 4783-4787.

Any one of the polynucleotides of the invention, including the DNA constructs described
herein, can be introduced in an embryonic stem (ES) cell line, preferably a mouse ES cell line. ES cell
lines are derived from pluripotent, uncommitted cells of the inner cell mass of pre-implantation
blastocysts. Preferred ES cell lines are the following: ES-E14TG2a (ATCC No. CRL-1821), ES-D3
(ATCC No. CRL-1934 and No. CRL-11632), YS001 (ATCC No. CRL-11776), 36.5 (ATCC No.
CRL-11116). To maintain ES cells in an uncommitted state, they are cultured in the presence of
growth-inhibited feeder cells which provide the appropriate signals to preserve this embryonic phenotype and
serve as a matrix for ES cell adherence. Preferred feeder cells consist of primary embryonic
fibroblasts that are established from tissue of day 13 to day 14 embryos of virtually any mouse strain,
that are maintained in culture, such as described by Abbondanzo S.J. et al., 1993, Methods in
Enzymology, Academic Press, New York, pp. 803-823; and are inhibited in growth by irradiation,
such as described by Robertson E., 1987, Embryo-derived stem cell lines. E.J. Robertson Ed.
Teratocarcinomas and embryonic stem cells: a practical approach. IRL Press, Oxford, pp. 71, or by the
presence of an inhibitory concentration of LIF, such as described by Pease S. and William R.S., 1990,
Exp. Cell. Res., 190: 209-211.

Transgenic Animals

The terms "transgenic animals" or "host animals" are used herein to designate non-human
animals that have their genome genetically and artificially manipulated so as to include one of the
nucleic acids according to the invention. Preferred animals are non-human mammals and include
those belonging to a genus selected from Mus (e.g. mice), Rattus (e.g. rats) and Oryctolagus (e.g.
rabbits) which have their genome artificially and genetically altered by the insertion of a nucleic acid
according to the invention.

The transgenic animals of the invention all include within a plurality of their cells a cloned
recombinant or synthetic DNA sequence, more specifically one of the purified or isolated nucleic
acids comprising a PG1 coding sequence, a PG1 regulatory polynucleotide or a DNA sequence
encoding an antisense polynucleotide such as described in the present specification.

Preferred transgenic animals according to the invention contain in their somatic cells and/or
in their germ line cells a polynucleotide selected from the following group of polynucleotides:

a) a non-native, purified or isolated nucleic acid encoding a PG1 polypeptide, or a polypeptide
fragment or variant thereof.

b) a non-native, purified or isolated nucleic acid comprising at least 8 consecutive nucleotides of
the nucleotide sequence SEQ ID NOs: 179, 182, or 183, or a nucleotide sequence complementary thereto; in
some embodiments, the length of the fragments can range from at least 8, 10, 15, 20 or 30 to 200
nucleotides, preferably from at least 10 to 50 nucleotides, more preferably from at least 40 to 50
nucleotides of SEQ ID NOs: 179, 182, or 183, or the sequence complementary thereto. In some
embodiments, the fragments may comprise more than 200 nucleotides of SEQ ID NOs: 179, 182, or
183, or the sequence complementary thereto.

c) a non-native, purified or isolated nucleic acid comprising at least 8 consecutive nucleotides
of the nucleotide sequence SEQ ID NOs: 3, 69, 112-125 or 184, a sequence complementary thereto or
a variant thereof; in some embodiments, the length of the fragments can range from at least 8, 10, 15,
20 or 30 to 200 nucleotides, preferably from at least 10 to 50 nucleotides, more preferably from at
least 40 to 50 nucleotides of SEQ ID NOs: 3, 69, 112-125 or 184, or the sequence complementary
thereto. In some embodiments, the fragments may comprise more than 200 nucleotides of SEQ ID
NOs: 3, 69, 112-125 or 184, or the sequence complementary thereto.

d) a non-native, purified or isolated nucleic acid comprising a nucleotide sequence selected
from the group of SEQ ID NOs: 100 to 111, a sequence complementary thereto or a fragment or a
variant thereof.

e) a non-native, purified or isolated nucleic acid comprising a combination of at least two
polynucleotides selected from the group consisting of SEQ ID NOs: 100 to 111, or the sequences
complementary thereto, wherein the polynucleotides are arranged within the nucleic acid, from the 5'
end to the 3' end of said nucleic acid, in the same order as in SEQ ID NOs: 179, 182, or 183.

f) a non-native, purified or isolated nucleic acid comprising the nucleotide sequence SEQ ID
NO: 180, or the sequences complementary thereto or a biologically active fragment or variant of the
nucleotide sequence of SEQ ID NO: 180, or the sequence complementary thereto.

g) a non-native, purified or isolated nucleic acid comprising the nucleotide sequence SEQ ID
NO: 181, or the sequence complementary thereto or a biologically active fragment or variant of the
nucleotide sequence of SEQ ID NO: 181 or the sequence complementary thereto.

h) a polynucleotide consisting of:

(1) a nucleic acid comprising a regulatory polynucleotide of SEQ ID NO: 180 or the sequences
complementary thereto or a biologically active fragment or variant thereof;

(2) a polynucleotide encoding a desired polypeptide or nucleic acid;

(3) optionally, a nucleic acid comprising a regulatory polynucleotide of SEQ ID NO: 181, or the
sequence complementary thereto or a biologically active fragment or variant thereof.

i) a DNA construct as described previously in the present specification.

The transgenic animals of the invention thus contain specific sequences of exogenous or
"non-native" genetic material, such as the nucleotide sequences described above in detail.

In a first preferred embodiment, these transgenic animals can be good experimental models in
order to study the diverse pathologies related to cell differentiation, in particular concerning the
transgenic animals within the genome of which has been inserted one or several copies of a
polynucleotide encoding a native PG1 protein, or alternatively a mutant PG1 protein.

In a second preferred embodiment, these transgenic animals may express a desired
polypeptide of interest under the control of the regulatory polynucleotides of the PG1 gene, leading to
good yields in the synthesis of this protein of interest, and possibly a tissue-specific expression of
this protein of interest.

The design of the transgenic animals of the invention is made according to the conventional
techniques well known to one skilled in the art. For more details regarding the production of
transgenic animals, and specifically transgenic mice, reference is made to Sandou et al. (1994) and also to
US Patents Nos 4,873,191, issued Oct. 10, 1989, 5,464,764, issued Nov 7, 1995 and 5,789,215, issued
Aug 4, 1998.

Transgenic animals of the present invention are produced by the application of procedures
which result in an animal with a genome that has incorporated exogenous genetic material. The
procedure involves obtaining the genetic material, or a portion thereof, which encodes either a PG1
coding sequence, a PG1 regulatory polynucleotide or a DNA sequence encoding a PG1 antisense
polynucleotide such as described in the present specification.

A recombinant polynucleotide of the invention is inserted into an embryonic stem (ES) cell
line. The insertion is preferably made using electroporation, such as described by Thomas K.R. et al.,
1987, Cell, 51: 503-512. The cells subjected to electroporation are screened (e.g. by selection via
selectable markers, by PCR or by Southern blot analysis) to find positive cells which have integrated
the exogenous recombinant polynucleotide into their genome, preferably via a homologous
recombination event. An illustrative positive-negative selection procedure that can be used according to
the invention is described by Mansour S.L. et al., 1988, Nature, 336: 348-352.

Then, the positive cells are isolated, cloned and injected into 3.5-day-old blastocysts from
mice, such as described by Bradley A., 1987, Production and analysis of chimaeric mice. In: E.J.
Robertson (Ed.), Teratocarcinomas and embryonic stem cells: A practical approach. IRL Press,
Oxford, pp. 113. The blastocysts are then inserted into a female host animal and allowed to grow to
term.

Alternatively, the positive ES cells are brought into contact with embryos at the 2.5-day-old
8-16 cell stage (morulae), such as described by Wood S.A. et al., 1993, Proc. Natl. Acad. Sci. USA,
90: 4582-4585; or by Nagy A. et al., 1993, Proc. Natl. Acad. Sci. USA, 90: 8424-8428. The ES cells
are internalized and extensively colonize the blastocyst, including the cells which will give rise to the
germ line. The offspring of the female host are tested to determine which animals are transgenic, i.e.
include the inserted exogenous DNA sequence, and which are wild-type.

Thus, the present invention also concerns a transgenic animal containing a nucleic acid, a
recombinant expression vector or a recombinant host cell according to the invention.

Recombinant cell lines derived from the transgenic animals of the invention

A further object of the invention consists of recombinant host cells obtained from a transgenic
animal described herein.

Recombinant cell lines can be established in vitro from cells obtained from any tissue of a
transgenic animal according to the invention, for example by transfection of primary cell cultures with
vectors expressing oncogenes such as SV40 large T antigen, as described by Chou J.Y., 1989, Mol.
Endocrinol., 3: 1511-1514; and Shay J.W. et al., 1991, Biochem. Biophys. Acta, 1072: 1-7.

Functional Analysis of the PG1 Polypeptides in Transgenic Animals
Using different BACs that contain the PG1 gene, we performed a FISH experiment on the
adenocarcinoma prostatic cell line PC3. Only one signal could be detected, showing that this region of
chromosome 8 is hemizygous in this tumoral cell line.

To study the function of PG1, the gene is inactivated by homologous recombination in the remaining
allele of PG1 in the PC3 cell line. To inactivate the remaining PG1 allele, a knock-out targeting vector
is generated by inserting two genomic DNA fragments of 3.0 and 4.3 kb (that correspond to a
sequence upstream of the PG1 promoter and to part of intron 1, respectively) in the pKO Scrambler
Neo TK vector (Lexicon ref V1901). Since the targeting vector contains the neomycin resistance
gene as well as the Tk gene, homologous recombination is selected by adding geneticin and FIAU to
the medium. The promoter, the transcriptional start site, and the first ATG contained in exon 1 on the
recombinant allele are deleted by homologous recombination between the targeting vector and the
remaining PG1 allele. Accordingly, no coding transcripts are initiated from the recombinant allele. The
parental PC3 cells as well as cells hemizygous for the null allele are assessed for their phenotype,
their growth rate in liquid culture, their ability to grow in agar (anchorage-independent growth) as
well as their ability to form tumors and metastases when injected subcutaneously in nude mice.
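
The logic of the geneticin plus FIAU selection described above can be sketched as a simple predicate. This is an idealized model under stated assumptions: every cell carrying neo is taken to be geneticin-resistant, every cell retaining the Tk gene is taken to be killed by FIAU, and homologous recombinants are assumed to lose the Tk gene because it lies outside the homology arms. The function name and the clone labels are hypothetical.

```python
def survives_selection(has_neo: bool, has_tk: bool) -> bool:
    """Idealized positive-negative selection.

    Geneticin kills cells lacking the neo marker (positive selection);
    FIAU kills cells expressing the Tk gene (negative selection).
    """
    return has_neo and not has_tk


# (has_neo, has_tk) for three idealized outcomes of transfection.
CLONES = {
    "no integration": (False, False),
    "random integration, Tk retained": (True, True),
    "homologous recombination, Tk lost": (True, False),
}

if __name__ == "__main__":
    for label, (neo, tk) in CLONES.items():
        outcome = "survives" if survives_selection(neo, tk) else "killed"
        print(f"{label}: {outcome}")
```

In this model only the homologous recombinants survive, which is why the surviving clones would still normally be confirmed by PCR or Southern blot analysis.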

To determine the function of PG1 in the animal, and to generate an animal model for prostate
tumorigenesis, mice in which tissue-specific inactivation of the PG1 alleles can be induced are
generated. For this purpose, the Cre-loxP system is utilized as described above to allow chromosome
engineering to be performed directly in the animal.

First, to generate mice with a conditional null allele, two loxP sites are introduced in the 
murine genome, the first one 5' to the PGl promoter and the second one 3' to the PGl exon 1. 
Alternatively, to generate subtle mutations or to specifically mutate some isoforms, the loxP sites are 
introduced so that they flank any given exon or any potential set of exons. It is important to
note that a functional PGl messenger can be transcribed from these alleles until recombination
between the loxP sites is triggered by the Cre enzyme.
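
The behavior of such a conditional allele can be illustrated with a toy string model. The following Python sketch is only an illustration under stated assumptions: the 34-bp loxP sequence is the commonly cited one, the flanking and floxed segments are hypothetical placeholders rather than PGl sequence, and the function name `cre_excise` is invented for this example. It models Cre recombination between two directly repeated loxP sites as deletion of the intervening segment, leaving a single loxP site behind.

```python
# Toy model of Cre-mediated excision between two directly repeated loxP sites.
# Assumption: commonly cited 34-bp loxP site (13-bp repeat, 8-bp spacer, 13-bp repeat).
LOXP = "ATAACTTCGTATA" + "GCATACAT" + "TATACGAAGTTAT"


def cre_excise(allele: str, site: str = LOXP) -> str:
    """Return the allele after deleting the segment flanked by two sites.

    One copy of the site is retained. Alleles carrying fewer than two
    sites are returned unchanged (nothing to recombine).
    """
    first = allele.find(site)
    if first == -1:
        return allele
    second = allele.find(site, first + len(site))
    if second == -1:
        return allele
    # Keep everything before the first site, then resume at the second site.
    return allele[:first] + allele[second:]


# Hypothetical placeholder segments (not real PGl sequence).
upstream = "GGGG"
promoter_and_exon1 = "TATAAAATGCCC"
downstream = "TTTT"

floxed_allele = upstream + LOXP + promoter_and_exon1 + LOXP + downstream
null_allele = cre_excise(floxed_allele)

print("floxed segment still present before Cre:", promoter_and_exon1 in floxed_allele)
print("floxed segment present after Cre:", promoter_and_exon1 in null_allele)
```

Until `cre_excise` is applied, the floxed segment remains in the allele, which mirrors the point made above that the allele stays functional until Cre triggers recombination.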

Second, to generate the inducer mice, the Cre gene is introduced in the mouse genome under 
the control of a tissue specific promoter, for example under the control of the PSA (prostate specific 
antigen) promoter. 

Finally, tissue specific inactivation of the PGl gene is induced by generating mice
containing the Cre transgene that are homozygous for the recombinant PGl allele.

Gene Therapy

The present invention also comprises the use of the PGl genomic DNA sequence of SEQ ID
NO: 179, the PGl cDNA of SEQ ID NO: 3, or nucleic acid encoding a mutant PGl protein responsible
for a detectable phenotype in gene therapy strategies, including antisense and triple helix strategies as
described in Examples 19 and 20, below. In antisense approaches, nucleic acid sequences
complementary to an mRNA are hybridized to the mRNA intracellularly, thereby blocking the
expression of the protein encoded by the mRNA. The antisense sequences may prevent gene expression
through a variety of mechanisms. For example, the antisense sequences may inhibit the ability of
ribosomes to translate the mRNA. Alternatively, the antisense sequences may block transport of the
mRNA from the nucleus to the cytoplasm, thereby limiting the amount of mRNA available for
translation. Another mechanism through which antisense sequences may inhibit gene expression is by
interfering with mRNA splicing. In yet another strategy, the antisense nucleic acid is incorporated in a
ribozyme capable of specifically cleaving the target mRNA.

Example 19 

Preparation and Use of Antisense Oligonucleotides

The antisense nucleic acid molecules to be used in gene therapy are either DNA or RNA
sequences. They may comprise a sequence complementary to the sequence of the PGl genomic DNA of
SEQ ID NO: 179, the PGl cDNA of SEQ ID NO: 3, or a nucleic acid encoding a PGl protein
responsible for a detectable phenotype. The antisense nucleic acids should have a length and melting
temperature sufficient to permit formation of an intracellular duplex having sufficient stability to inhibit
the expression of the PGl mRNA in the duplex. Strategies for designing antisense nucleic acids suitable
for use in gene therapy are disclosed in Green et al., Ann. Rev. Biochem. 55:569-597 (1986) and Izant
and Weintraub, Cell 36:1007-1015 (1984).

In some strategies, antisense molecules are obtained by reversing the orientation of the PGl
coding region with respect to a promoter so as to transcribe the opposite strand from that which is
normally transcribed in the cell. The antisense molecules are transcribed using in vitro transcription
systems such as those which employ T7 or SP6 polymerase to generate the transcript. Another approach
involves transcription of PGl antisense nucleic acids in vivo by operably linking DNA containing the
antisense sequence to a promoter in an expression vector.
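
The relationship between a target mRNA segment and its antisense sequence can be sketched in a few lines of Python. This is a minimal illustration, not a design tool: the function name `antisense_dna` and the six-base example are hypothetical, and the sketch only computes the reverse complement, ignoring the length, melting temperature and stability considerations discussed above.

```python
# Map each base of the target (DNA or RNA alphabet) to its DNA complement.
COMPLEMENT = str.maketrans("ACGTU", "TGCAA")


def antisense_dna(target: str) -> str:
    """Return the antisense DNA oligonucleotide (written 5'->3') for a
    target mRNA or sense-strand segment (also written 5'->3')."""
    seq = target.upper()
    unexpected = set(seq) - set("ACGTU")
    if unexpected:
        raise ValueError(f"unexpected characters: {sorted(unexpected)}")
    # Complement every base, then reverse so the result reads 5'->3'.
    return seq.translate(COMPLEMENT)[::-1]


# Hypothetical six-base target segment, for illustration only.
print(antisense_dna("AUGGCC"))
```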

Alternatively, oligonucleotides which are complementary to the strand of the PGl gene normally
transcribed in the cell are synthesized in vitro. Thus, the antisense PGl nucleic acids are complementary
to the PGl mRNA and are capable of hybridizing to the mRNA to create a duplex. In some
embodiments, the PGl antisense sequences may contain modified sugar phosphate backbones to increase
stability and make them less sensitive to RNase activity. Examples of modifications suitable for use in
antisense strategies are described by Rossi et al., Pharmacol. Ther. 50(2):245-254, (1991).

Various types of antisense oligonucleotides complementary to the sequence of the PGl genomic
DNA of SEQ ID NO: 179, the PGl cDNA of SEQ ID NO: 3, or a nucleic acid encoding a PGl protein
responsible for a detectable phenotype are used. In one preferred embodiment, stable and semi-stable
antisense oligonucleotides as described in International Application No. PCT WO94/23026, are used to
inhibit the expression of the PGl gene. In these molecules, the 3' end or both the 3' and 5' ends are
engaged in intramolecular hydrogen bonding between complementary base pairs. These molecules are
better able to withstand exonuclease attacks and exhibit increased stability compared to conventional
antisense oligonucleotides.

In another preferred embodiment, the antisense oligodeoxynucleotides described in International
Application No. WO 95/04141, are used to inhibit expression of the PGl gene.

In yet another preferred embodiment, the covalently cross-linked antisense oligonucleotides
described in International Application No. WO 96/31523, are used to inhibit expression of the PGl gene.
These double- or single-stranded oligonucleotides comprise one or more, respectively, inter- or
intra-oligonucleotide covalent cross-linkages, wherein the linkage consists of an amide bond between a
primary amine group of one strand and a carboxyl group of the other strand or of the same strand,
respectively; the primary amine group being directly substituted in the 2' position of the strand
nucleotide monosaccharide ring, and the carboxyl group being carried by an aliphatic spacer group
substituted on a nucleotide or nucleotide analog of the other strand or the same strand, respectively.

The antisense oligodeoxynucleotides and oligonucleotides disclosed in International Application
No. WO 92/18522, may also be used to inhibit the expression of the PGl gene. These molecules are
stable to degradation and contain at least one transcription control recognition sequence which binds to
control proteins and are effective as decoys therefor. These molecules may contain "hairpin" structures,
"dumbbell" structures, "modified dumbbell" structures, "cross-linked" decoy structures and "loop"
structures.

In another preferred embodiment, the cyclic double-stranded oligonucleotides described in
European Patent Application No. 0 572 287 A2, are used to inhibit the expression of the PGl gene.
These ligated oligonucleotide "dumbbells" contain the binding site for a transcription factor which binds
to the PGl promoter and inhibits expression of the gene under control of the transcription factor by
sequestering the factor.

Use of the closed antisense oligonucleotides disclosed in International Application No. WO
92/19732, is also contemplated. Because these molecules have no free ends, they are more resistant to
degradation by exonucleases than are conventional oligonucleotides. These oligonucleotides are
multifunctional, interacting with several regions which are not adjacent to the target mRNA.

The appropriate level of antisense nucleic acids required to inhibit PGl gene expression is
determined using in vitro expression analysis. The antisense molecule is introduced into the cells by
diffusion, injection, infection or transfection using procedures known in the art. For example, the
antisense nucleic acids can be introduced into the body as a bare or naked oligonucleotide,
oligonucleotide encapsulated in lipid, oligonucleotide sequence encapsidated by viral protein, or as an
oligonucleotide operably linked to a promoter contained in an expression vector. The expression vector
is any of a variety of expression vectors known in the art, including retroviral or viral vectors, vectors
capable of extrachromosomal replication, or integrating vectors. The vectors are DNA or RNA.

The PGl antisense molecules are introduced onto cell samples at a number of different
concentrations, preferably between 1x10^-10 M and 1x10^-4 M. Once the minimum concentration that can
adequately control gene expression is identified, the optimized dose is translated into a dosage suitable
for use in vivo. For example, an inhibiting concentration in culture of 1x10^-7 M translates into a dose of
approximately 0.6 mg/kg bodyweight. Levels of oligonucleotide approaching 100 mg/kg bodyweight or
higher are possible after testing the toxicity of the oligonucleotide in laboratory animals. It is additionally
contemplated that cells from the vertebrate are removed, treated with the antisense oligonucleotide, and
reintroduced into the vertebrate.
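
The text does not state how a culture concentration is converted into a bodyweight dose. The following Python sketch shows one back-of-the-envelope conversion that reproduces the quoted figure of roughly 0.6 mg/kg, assuming the culture concentration is a molar concentration of 1x10^-7 M. The oligonucleotide molecular weight of about 6000 g/mol (roughly a 20-mer) and the distribution volume of 1 litre per kg bodyweight are assumptions made for illustration only, and the function name is hypothetical.

```python
def dose_mg_per_kg(conc_molar: float,
                   mol_weight_g_per_mol: float = 6000.0,   # assumed ~20-mer oligo
                   volume_l_per_kg: float = 1.0) -> float:  # assumed distribution volume
    """Convert a molar culture concentration into an approximate mg/kg dose."""
    grams_per_litre = conc_molar * mol_weight_g_per_mol
    mg_per_litre = grams_per_litre * 1000.0
    return mg_per_litre * volume_l_per_kg


# Assumed inhibiting concentration in culture of 1e-7 M.
print(f"{dose_mg_per_kg(1e-7):.2f} mg/kg")
```

Any real dose would of course depend on the actual molecular weight of the oligonucleotide, its pharmacokinetics, and toxicity testing in animals, as noted above.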

It is further contemplated that the PGl antisense oligonucleotide sequence is incorporated into a
ribozyme sequence to enable the antisense to specifically bind and cleave its target mRNA. For
technical applications of ribozyme and antisense oligonucleotides see Rossi et al., supra.

In a preferred application of this invention, antibody-mediated tests such as RIAs and ELISA, 
functional assays, or radiolabeling are used to determine the effectiveness of antisense inhibition on PGl 
expression. 

The PGl cDNA, the PGl genomic DNA, and the PGl alleles of the present invention may also
be used in gene therapy approaches based on intracellular triple helix formation. Triple helix
oligonucleotides are used to inhibit transcription from a genome. They are particularly useful for
studying alterations in cell activity as it is associated with a particular gene. The PGl cDNA, PGl
genomic DNA, or PGl allele of the present invention or, more preferably, a portion of those sequences,
can be used to inhibit gene expression in individuals suffering from prostate cancer or another detectable
phenotype or individuals at risk for developing prostate cancer or another detectable phenotype at a later
date as a result of their PGl genotype. Similarly, a portion of the PGl cDNA, the PGl genomic DNA,
or the PGl alleles can be used to study the effect of inhibiting PGl transcription within a cell.
Traditionally, homopurine sequences were considered the most useful for triple helix strategies, such as
those described in Example 20, below. However, homopyrimidine sequences can also inhibit gene
expression. Such homopyrimidine oligonucleotides bind to the major groove at
homopurine:homopyrimidine sequences. Thus, both types of sequences from the PGl cDNA, the PGl
genomic DNA, and the PGl alleles are contemplated within the scope of this invention.

Example 20 

The sequences of the PGl cDNA, the PGl genomic DNA, and the PGl alleles are scanned to
identify 10-mer to 20-mer homopyrimidine or homopurine stretches which could be used in triple-helix
based strategies for inhibiting PGl expression. Following identification of candidate homopyrimidine or
homopurine stretches, their efficiency in inhibiting PGl expression is assessed by introducing varying
amounts of oligonucleotides containing the candidate sequences into tissue culture cells which express
the PGl gene. The oligonucleotides are prepared on an oligonucleotide synthesizer or they are purchased
commercially from a company specializing in custom oligonucleotide synthesis, such as GENSET, Paris,
France.
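
The scanning step described above can be sketched with a regular expression. The Python example below is a minimal sketch, not the procedure actually used: the function name `find_triplex_candidates`, the test sequence, and the choice to trim runs longer than 20 bases to their first 20 bases are all assumptions made for illustration.

```python
import re


def find_triplex_candidates(seq: str, min_len: int = 10, max_len: int = 20):
    """Return (start, end, kind, stretch) for homopurine and homopyrimidine runs.

    Runs shorter than min_len are ignored. Runs longer than max_len are
    trimmed to their first max_len bases (an arbitrary illustrative choice).
    Coordinates are 0-based, end-exclusive.
    """
    seq = seq.upper()
    hits = []
    for kind, base_class in (("homopurine", "[AG]"), ("homopyrimidine", "[CT]")):
        # Match maximal runs of at least min_len bases of one class.
        for match in re.finditer(f"{base_class}{{{min_len},}}", seq):
            start = match.start()
            end = min(match.end(), start + max_len)
            hits.append((start, end, kind, seq[start:end]))
    return sorted(hits)


# Hypothetical test sequence, not PGl sequence.
example = "ACGT" + "AGGAGAAGGAGA" + "TACA" + "TCTTCCTCTTC" + "ACGT"
for hit in find_triplex_candidates(example):
    print(hit)
```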

The oligonucleotides are introduced into the cells using a variety of methods known to those
skilled in the art, including but not limited to calcium phosphate precipitation, DEAE-Dextran,
electroporation, liposome-mediated transfection or native uptake.

Treated cells are monitored for altered cell function or reduced PGl expression using techniques
such as Northern blotting, RNase protection assays, or PCR based strategies to monitor the transcription
levels of the PGl gene in cells which have been treated with the oligonucleotide.

The oligonucleotides which are effective in inhibiting gene expression in tissue culture cells
may then be introduced in vivo using the techniques described above and in Example 19 at a dosage
calculated based on the in vitro results, as described in Example 19.

In some embodiments, the natural (beta) anomers of the oligonucleotide units can be replaced
with alpha anomers to render the oligonucleotide more resistant to nucleases. Further, an intercalating
agent such as ethidium bromide, or the like, can be attached to the 3' end of the alpha oligonucleotide to
stabilize the triple helix. For information on the generation of oligonucleotides suitable for triple helix
formation see Griffin et al. (Science 245:967-971 (1989)).

Alternatively, the PGl cDNA, the PGl genomic DNA, and the PGl alleles of the present
invention are used in gene therapy approaches in which expression of the PGl protein is beneficial, as
described in Example 21 below.

Example 21

The PGl cDNA, the PGl genomic DNA, and the PGl alleles of the present invention may also
be used to express the PGl protein or a portion thereof in a host organism to produce a beneficial effect.
In such procedures, the PGl protein is transiently expressed in the host organism or stably expressed in
the host organism. The expressed PGl protein is used to treat conditions resulting from a lack of PGl
expression or conditions in which augmentation of existing levels of PGl expression is beneficial.

A nucleic acid encoding the PGl proteins of SEQ ID NO: 4, SEQ ID NO: 5, or a PGl allele is
introduced into the host organism. The nucleic acid is introduced into the host organism using a variety
of techniques known to those of skill in the art. For example, the nucleic acid is injected into the host
organism as naked DNA such that the encoded PGl protein is expressed in the host organism, thereby
producing a beneficial effect.

Alternatively, the nucleic acid encoding the PGl proteins of SEQ ID NO: 4, SEQ ID NO: 5, or a
PGl allele is cloned into an expression vector downstream of a promoter which is active in the host
organism. The expression vector is any of the expression vectors designed for use in gene therapy,
including viral or retroviral vectors.

The expression vector is directly introduced into the host organism such that the PGl protein is
expressed in the host organism to produce a beneficial effect. In another approach, the expression vector
is introduced into cells in vitro. Cells containing the expression vector are thereafter selected and
introduced into the host organism, where they express the PGl protein to produce a beneficial effect.

IX. ISOLATION OF PGl cDNA FROM NONHUMAN MAMMALS

The present invention encompasses mammalian PGl sequences including genomic and cDNA
sequences, as well as polypeptide sequences. The present invention also encompasses the use of PGl
genomic and cDNA sequences of the invention, including SEQ ID NOs: 179, 3, 182, and 183, in
methods of isolating and characterizing PGl nucleotide sequences derived from nonhuman mammals, in
addition to sequences derived from human sequences. The human and mouse PGl nucleic acid
sequences of the invention can be used to construct primers and probes for amplifying and identifying
PGl genes in other nonhuman animals, particularly mammals. The primers and probes used to identify
nonhuman PGl sequences are selected and used for the isolation of nonhuman PGl utilizing the same
techniques described above in Examples 4, 5, 6, 12 and 13.

In addition, sequence analysis of other homologous proteins is used to optimize the sequences of
these primers and probes. As described above in the Analysis of the PGl Protein Sequence, three boxes
of homology were identified in the structure of the PGl protein product when compared to proteins from
a diverse range of organisms. See Figure 9. Using the assumption that the nucleotide sequences for
these homologous proteins also show a high degree of homology, it is possible to construct primers that
are specific for the PCR amplification of PGl cDNA in nonhuman mammals.

Example 22 

The primers BOXIed: AATCATCAAAGCACAGTTGACTGGAT (SEQ ID NO: 77) and
BOXIIIer: ATAAACCACCGTAACATCATAAATTGCATCTAA (SEQ ID NO: 78) were designed as
PCR primers from the human PGl sequences after comparison with the sequence homologies of Figure
9. The BOXIed (SEQ ID NO: 77) and BOXIIIer (SEQ ID NO: 78) primers were used to amplify a
mouse PGl cDNA sequence from mouse liver marathon-ready cDNA (Clontech) under the conditions
described above in Example 4. This PCR reaction yielded a product of approximately 400 base pairs,
the boxI-boxIII fragment, which was subjected to automated dideoxy terminator sequencing and
electrophoresed on ABI 377 sequencers as described above. Sequence analysis confirmed very high
homology to human PGl both at the nucleic acid and protein levels.

Primers were designed for RACE analysis using the 400 base pair boxI-boxIII fragment. Further
sequence information was obtained using 5' and 3' RACE reactions on mouse liver marathon cDNA
using two sets of these nested PCR primers: moPGlRACE5.350: AATCAAAAGCAACGTGAGTGGC
(SEQ ID NO: 94) and moPGlRACE5.276: GCAAATGCCTGACTGGCTGA (SEQ ID NO: 93) for the
5' RACE reaction and moPGlRACE3.18: CTGCCAGACAGGATGCCCTA (SEQ ID NO: 90) and
moPGlRACE3.63: ACAAGTrAAAATGGCITCCGCTG (SEQ ID NO: 91) for the 3' RACE reaction.
The PCR products of the RACE reactions were sequenced by primer walking using the following
primers:

moPGrace3S473: GAGATAAAAG ATAGGTTGCT CA (SEQ ID NO: 79);
moPGrace3S526: AAGAAACAAA TTTCCTGGG (SEQ ID NO: 80);
moPGrace3S597: TCTTGGGGAG TTTGACTG (SEQ ID NO: 81);
moPGrace5R323: GACCCCGGTG TAGTTCTC (SEQ ID NO: 82);
moPGrace5R372: CAGTAAAGCC GGTCGTC (SEQ ID NO: 83);
moPGrace5R444: CAGGCCAGCA GGTAGGT (SEQ ID NO: 84);
moPGrace5R492: AGCAGGTAGC GCATAGAGT (SEQ ID NO: 85).

Again, a high degree of homology between the mouse sequence obtained from the primer
walking and the human PGl sequence was observed. An additional pair of nested primers was
designed and utilized to further extend the 3' mouse PGl sequence in yet another RACE reaction:
moPG3RACE2: TGGGCACCTG GTTGTATGGA (SEQ ID NO: 95) and moPG3RACE2n:
TCCTTGGCTG CCTGTGGnT (SEQ ID NO: 96). The PCR product of this final RACE reaction
was also sequenced by primer walking using the following primers:

moPGlRACE3R94: CAAATGCATG TTGGCTGT (SEQ ID NO: 92);
moPG3RACES20: GATGGCTACA CATTGTATCA C (SEQ ID NO: 97);
moPG3RACES5: TCCTGAATTA AATAAGGAGT TTTC (SEQ ID NO: 98);
moPG3RACES90: GTTTGTTATT AAAGCATAAG CAAG (SEQ ID NO: 99).

The overlap in the 5' RACE, boxI-boxIII, and 3' RACE fragments allowed a single contiguous
coding sequence for the mouse PGl ortholog to be generated by alignment of the three fragments. Primers
were chosen from near the 5' and 3' ends of this predicted contiguous sequence (contig) in order to
confirm the existence of such a transcript. PCR amplification was performed again on mouse liver
marathon-ready cDNA (Clontech) with the chosen primers, moPG15: TGGCGAGCCGAGAGGATG
(SEQ ID NO: 87) and moPG13LR2: GGAAACAATGTGATACAATGTGTAGCC (SEQ ID NO: 86),
under the PCR conditions described above in Example 4. The resulting PCR product was a roughly 1.2
kb DNA molecule and was shown to have an identical sequence to that of the deduced contig. Finally,
modified versions of the moPG15 and moPG13LR2 primers with the addition of EcoRI and BamHI
sites, moPG15EcoRI: CGTGAATTCTGGCGAGCCGAGAGGATG (SEQ ID NO: 89) and
moPG13BamI: CGTGGATCCGGAAACAATGTGATACAATGTGTAGCC (SEQ ID NO: 88), were
used to obtain a PCR product that could be cloned into a pSKBluescript plasmid (Stratagene) cleaved
with EcoRI and BamHI restriction enzymes. The mouse PGl cDNA in the resulting construct was
subjected to automated dideoxy terminator sequencing and electrophoresed on ABI 377 sequencers as
described above. The sequence for mouse PGl cDNA is reported in SEQ ID NO: 72, and the deduced
amino acid sequence corresponding to the cDNA is reported in SEQ ID NO: 74.
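
The construction of the restriction-site-tailed cloning primers can be checked with a short Python sketch. The primer sequences are those quoted above for SEQ ID NOs: 86 to 89, and GAATTC and GGATCC are the standard EcoRI and BamHI recognition sites. Describing the three leading bases (CGT) as a 5' "clamp" is an interpretation made here for illustration, and the function and variable names are hypothetical.

```python
ECORI_SITE = "GAATTC"   # EcoRI recognition sequence
BAMHI_SITE = "GGATCC"   # BamHI recognition sequence
CLAMP = "CGT"           # extra 5' bases seen in the tailed primers (interpreted as a clamp)


def add_cloning_tail(primer: str, site: str, clamp: str = CLAMP) -> str:
    """Prepend clamp bases and a restriction site to the 5' end of a primer."""
    return clamp + site + primer


# Gene-specific primers as quoted in the text.
mo_pg15 = "TGGCGAGCCGAGAGGATG"                 # SEQ ID NO: 87
mo_pg13lr2 = "GGAAACAATGTGATACAATGTGTAGCC"     # SEQ ID NO: 86

# Tailed primers as quoted in the text.
reported_ecori_primer = "CGTGAATTCTGGCGAGCCGAGAGGATG"            # SEQ ID NO: 89
reported_bamhi_primer = "CGTGGATCCGGAAACAATGTGATACAATGTGTAGCC"   # SEQ ID NO: 88

print(add_cloning_tail(mo_pg15, ECORI_SITE) == reported_ecori_primer)
print(add_cloning_tail(mo_pg13lr2, BAMHI_SITE) == reported_bamhi_primer)
```

Using two different restriction sites on the two primers allows the PCR product to be inserted into the cleaved plasmid in a defined orientation.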

Example 23 

A mouse BAC library was constructed by the cloning of BamHI partially digested DNA of
pluripotent embryonic stem cells, cell line ES-E14TG2a (ATCC CRL-1821), into pBeloBAC11 vector
plasmid. Approximately fifty-six thousand clones with an average insert size of 120 kb were picked
individually and pooled for PCR screening as described above for human BAC library screening.
These pools were screened with STS g34292 derived from the region of the mouse PGl transcript
corresponding to exon 6 of the human gene. The upstream and downstream primers defining this STS
are: upstream amplification primer for g34292: ATTAAAACAC GTACTGACAC CA (SEQ ID NO:
75), and downstream amplification primer for g34292: AGTCATGGAT GGTGGATTT (SEQ ID NO:
76). BAC G0281H06 tested positive for hybridizing to g34292. This BAC was isolated and sequenced
by sub-cloning into pGenDel sequencing vector. The resulting partial genomic sequence for mouse PGl
is reported in SEQ ID NO: 73. This process was repeated and the resulting partial genomic sequences
for mouse PGl are reported in SEQ ID NOs: 182 and 183.

Other mammalian PGl cDNA and genomic sequences can be isolated by the methods of the
present invention. PGl genes in mammalian species have a region of at least 100, preferably 200,
more preferably 500 nucleotides in each mammal's most abundant transcription species which has at
least 75%, preferably 85%, more preferably 95% sequence homology to the most abundant human or
mouse cDNA species (SEQ ID NO: 3). PGl proteins in mammalian species have a region of at least
40, preferably 90, more preferably 160 amino acids in the deduced amino acid sequence of the most
abundant PGl transcription species which has at least 75%, preferably 85%, more preferably 95%
sequence homology to the deduced amino acid sequence of the most abundant human or mouse
translation species (SEQ ID NO: 4 or 74).
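
The region-length and percentage thresholds above can be illustrated with a sliding-window calculation. The Python sketch below makes several simplifying assumptions that are not taken from the text: "sequence homology" is approximated as percent identity, the two sequences are assumed to be already aligned without gaps and of equal length, and the function names are hypothetical.

```python
def best_window_identity(seq_a: str, seq_b: str, window: int) -> float:
    """Return the highest fraction of identical positions found in any
    window of the given length, for two pre-aligned, equal-length sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be pre-aligned to equal length")
    if not 0 < window <= len(seq_a):
        raise ValueError("window must be between 1 and the sequence length")
    matches = [int(x == y) for x, y in zip(seq_a.upper(), seq_b.upper())]
    current = sum(matches[:window])
    best = current
    # Slide the window one position at a time, updating the match count.
    for i in range(window, len(matches)):
        current += matches[i] - matches[i - window]
        best = max(best, current)
    return best / window


def meets_threshold(seq_a: str, seq_b: str,
                    window: int = 100, min_identity: float = 0.75) -> bool:
    """True if some window of the given length reaches the identity threshold."""
    return best_window_identity(seq_a, seq_b, window) >= min_identity


# Tiny hypothetical example with a 4-base window instead of 100 bases.
print(best_window_identity("ACGTAC", "ACGAAC", window=4))
```

For nucleotide sequences the defaults correspond to the least stringent criteria quoted above (100 nucleotides, 75%); the protein criteria would use a window of 40 amino acids instead.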

X. METHODS FOR GENOTYPING AN INDIVIDUAL FOR BIALLELIC MARKERS

Methods are provided to genotype a biological sample for one or more biallelic markers of the
present invention, all of which are performed in vitro. Such methods of genotyping comprise
determining the identity of a nucleotide at a PGl-related biallelic marker by any method known in
the art. These methods find use in genotyping case-control populations in association studies as well
as individuals in the context of detection of alleles of biallelic markers which are known to be
associated with a given trait, in which case both copies of the biallelic marker present in the individual's
genome are determined so that an individual is classified as homozygous or heterozygous for a
particular allele.
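
The classification step can be sketched as follows. This is a minimal Python illustration in which the marker alleles A and G, the function name `classify_genotype`, and the output strings are all hypothetical choices; it simply compares the nucleotides determined on the two copies of a biallelic marker.

```python
def classify_genotype(allele_1: str, allele_2: str,
                      marker_alleles=("A", "G")) -> str:
    """Classify an individual at a biallelic marker from the nucleotide
    observed on each of the two chromosome copies."""
    observed = (allele_1.upper(), allele_2.upper())
    for base in observed:
        if base not in marker_alleles:
            raise ValueError(f"{base!r} is not an allele of this marker")
    if observed[0] == observed[1]:
        return f"homozygous {observed[0]}"
    return "heterozygous " + "/".join(sorted(observed))


print(classify_genotype("A", "A"))
print(classify_genotype("G", "A"))
```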

These genotyping methods can be performed on nucleic acid samples derived from a single
individual or on pooled DNA samples.

Genotyping can be performed using similar methods as those described above for the 
identification of the biallelic markers, or using other genotyping methods such as those further 
described below. In preferred embodiments, the comparison of sequences of amplified genomic 
fragments from different individuals is used to identify new biallelic markers whereas 

microsequencing is used for genotyping known biallelic markers in diagnostic and association study
applications.

X.A. Source of DNA for genotyping

Any source of nucleic acids, in purified or non-purified form, can be utilized as the starting
nucleic acid, provided it contains or is suspected of containing the specific nucleic acid sequence
desired. DNA or RNA is extracted from cells, tissues, or body fluids. As for the source of genomic
DNA to be subjected to analysis, any test sample can be foreseen without any particular limitation.
These test samples include biological samples, which can be tested by the methods of the present
invention described herein, and include human and animal body fluids such as whole blood, serum,
plasma, cerebrospinal fluid, urine, lymph fluids, and various external secretions of the respiratory,
intestinal and genitourinary tracts, tears, saliva, milk, white blood cells, myelomas and the like;
biological fluids such as cell culture supernatants; fixed tissue specimens including tumor and
non-tumor tissue and lymph node tissues; bone marrow aspirates and fixed cell specimens. The preferred
source of genomic DNA used in the present invention is from peripheral venous blood of each donor.
Techniques to prepare genomic DNA from biological samples are well known to the skilled
technician. While nucleic acids for use in the genotyping methods of the invention can be derived
from any mammalian source, the test subjects and individuals from which nucleic acid samples are
taken are generally understood to be human.

X.B. Amplification Of DNA Fragments Comprising Biallelic Markers

Methods and polynucleotides are provided to amplify a segment of nucleotides comprising
one or more biallelic markers of the present invention. It will be appreciated that amplification of
DNA fragments comprising biallelic markers is used in various methods and for various purposes and
is not restricted to genotyping. Nevertheless, many genotyping methods, although not all, require the
previous amplification of the DNA region carrying the biallelic marker of interest. Such methods
specifically increase the concentration or total number of sequences that span the biallelic marker or
include that site and sequences located either distal or proximal to it. Diagnostic assays may also rely
on amplification of DNA segments carrying a biallelic marker of the present invention.

Amplification of DNA is achieved by any method known in the art, for example by the established PCR
(polymerase chain reaction) method or by developments thereof or alternatives. Amplification
methods which can be utilized herein include but are not limited to Ligase Chain Reaction (LCR) as
described in EP A 320 308 and EP A 439 182, Gap LCR (Wolcott, M.J., Clin. Microbiol. Rev. 5:370-386),
the so-called "NASBA" or "3SR" technique described in Guatelli J.C. et al. (Proc. Natl. Acad.
Sci. USA 87:1874-1878, 1990) and in Compton J. (Nature 350:91-92, 1991), Q-beta amplification as
described in European Patent Application no 4544610, strand displacement amplification as described
in Walker et al. (Clin. Chem. 42:9-13, 1996) and EP A 684 315, and target mediated amplification as
described in PCT Publication WO 9322461.

LCR and Gap LCR are exponential amplification techniques; both depend on DNA ligase to
join adjacent primers annealed to a DNA molecule. In Ligase Chain Reaction (LCR), probe pairs are
used which include two primary (first and second) and two secondary (third and fourth) probes, all of
which are employed in molar excess to target. The first probe hybridizes to a first segment of the
target strand and the second probe hybridizes to a second segment of the target strand, the first and
second segments being contiguous so that the primary probes abut one another in 5' phosphate-
3' hydroxyl relationship, and so that a ligase can covalently fuse or ligate the two probes into a fused
product. In addition, a third (secondary) probe can hybridize to a portion of the first probe and a
fourth (secondary) probe can hybridize to a portion of the second probe in a similar abutting fashion.
Of course, if the target is initially double stranded, the secondary probes also will hybridize to the
target complement in the first instance. Once the ligated strand of primary probes is separated from
the target strand, it will hybridize with the third and fourth probes which can be ligated to form a
complementary, secondary ligated product. It is important to realize that the ligated products are
functionally equivalent to either the target or its complement. By repeated cycles of hybridization and
ligation, amplification of the target sequence is achieved. A method for multiplex LCR has also been
described (WO 9320227). Gap LCR (GLCR) is a version of LCR where the probes are not adjacent
but are separated by 2 to 3 bases.
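
The geometry of the four LCR probes can be made concrete with a small Python sketch. It is an illustration under stated assumptions only: the target duplex is represented by its top strand written 5'->3', the junction position and probe length are arbitrary parameters, the example sequence is hypothetical, and the sketch covers plain LCR with abutting probes rather than Gap LCR.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq: str) -> str:
    """Reverse complement of a DNA sequence written 5'->3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def design_lcr_probes(top_strand: str, junction: int, probe_len: int = 20):
    """Return (first, second, third, fourth) probes, each written 5'->3'.

    The first and second (primary) probes abut at `junction` and hybridize
    to the bottom strand; the third and fourth (secondary) probes are
    complementary to the first and second probes, respectively.
    """
    if junction - probe_len < 0 or junction + probe_len > len(top_strand):
        raise ValueError("probes would extend beyond the target sequence")
    top = top_strand.upper()
    first = top[junction - probe_len:junction]
    second = top[junction:junction + probe_len]
    third = revcomp(first)
    fourth = revcomp(second)
    return first, second, third, fourth


# Hypothetical 20-base target with the ligation junction in the middle.
target = "ACGTTGCAAGGCTTAACCGG"
first, second, third, fourth = design_lcr_probes(target, junction=10, probe_len=5)

primary_product = first + second     # ligated primary probes
secondary_product = fourth + third   # ligated secondary probes
print(secondary_product == revcomp(primary_product))
```

The final check reflects the statement above that the ligated products are functionally equivalent to the target or its complement: the ligated secondary product is the reverse complement of the ligated primary product, so each can serve as a template for the other probe pair in the next cycle.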

For amplification of mRNAs, it is within the scope of the present invention to reverse
transcribe mRNA into cDNA followed by polymerase chain reaction (RT-PCR); or, to use a single
enzyme for both steps as described in U.S. Patent No. 5,322,770; or, to use Asymmetric Gap LCR
(RT-AGLCR) as described by Marshall R.L. et al. (PCR Methods and Applications 4:80-84, 1994).
AGLCR is a modification of GLCR that allows the amplification of RNA.

Some of these amplification methods are particularly suited for the detection of single 
nucleotide polymorphisms and allow the simultaneous amplification of a target sequence and the 
identification of the polymorphic nucleotide as it is further described in X.C. 

The PCR technology is the preferred amplification technique used in the present invention. A
variety of PCR techniques are familiar to those skilled in the art. For a review of PCR technology, see
Molecular Cloning to Genetic Engineering White, B.A. Ed. in Methods in Molecular Biology 67:
Humana Press, Totowa (1997) and the publication entitled "PCR Methods and Applications" (1991,
Cold Spring Harbor Laboratory Press). In each of these PCR procedures, PCR primers on either side of
the nucleic acid sequences to be amplified are added to a suitably prepared nucleic acid sample along
with dNTPs and a thermostable polymerase such as Taq polymerase, Pfu polymerase, or Vent
polymerase. The nucleic acid in the sample is denatured and the PCR primers are specifically hybridized
to complementary nucleic acid sequences in the sample. The hybridized primers are extended.
Thereafter, another cycle of denaturation, hybridization, and extension is initiated. The cycles are
repeated multiple times to produce an amplified fragment containing the nucleic acid sequence between
the primer sites. PCR has further been described in several patents including US Patents 4,683,195,
4,683,202 and 4,965,188.

The identification of biallelic markers as described above allows the design of appropriate
oligonucleotides, which can be used as primers to amplify DNA fragments comprising the biallelic
markers of the present invention. Amplification can be performed using the primers initially used to
discover new biallelic markers which are described herein or any set of primers allowing the
amplification of a DNA fragment comprising a biallelic marker of the present invention. Primers can
be prepared by any suitable method, for example by direct chemical synthesis using a method such as
the phosphodiester method of Narang S.A. et al. (Methods Enzymol. 68:90-98, 1979), the
phosphodiester method of Brown E.L. et al. (Methods Enzymol. 68:109-151, 1979), the
diethylphosphoramidite method of Beaucage et al. (Tetrahedron Lett. 22:1859-1862, 1981) and the
solid support method described in EP 0 707 592.
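
How a primer pair defines the amplified fragment can be sketched as an "in silico PCR" on a single template strand. The Python example below is a simplified illustration, assuming exact primer matches, a single binding site for each primer, and a hypothetical template; the function names are invented for this example and it does not model cycling or reaction conditions.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq: str) -> str:
    """Reverse complement of a DNA sequence written 5'->3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def in_silico_pcr(template: str, forward: str, reverse: str):
    """Return the predicted amplicon (top strand, 5'->3') or None.

    The forward primer must match the top strand exactly; the reverse
    primer must match the bottom strand, i.e. its reverse complement must
    occur on the top strand downstream of the forward primer.
    """
    top = template.upper()
    start = top.find(forward.upper())
    if start == -1:
        return None
    reverse_site = revcomp(reverse)
    end = top.find(reverse_site, start + len(forward))
    if end == -1:
        return None
    return top[start:end + len(reverse_site)]


# Hypothetical template: flanks, two primer sites, and an internal segment
# that stands in for the region carrying a biallelic marker.
template = "TTTT" + "ACGTACGTAC" + "GGGGCCCCAA" + "TGCATGCATG" + "AAAA"
forward_primer = "ACGTACGTAC"
reverse_primer = revcomp("TGCATGCATG")

amplicon = in_silico_pcr(template, forward_primer, reverse_primer)
print(amplicon, len(amplicon))
```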

In some embodiments the present invention provides primers for amplifying a DNA fragment
containing one or more biallelic markers of the present invention. It will be appreciated that the
amplification primers listed in the present specification are merely exemplary and that any other set of
primers which produce amplification products containing one or more biallelic markers of the present
invention may be used.

The primers are selected to be substantially complementary to the different strands of each
specific sequence to be amplified. The length of the primers of the present invention can range from
8 to 100 nucleotides, preferably from 8 to 50, 8 to 30 or more preferably 8 to 25 nucleotides. Shorter
primers tend to lack specificity for a target nucleic acid sequence and generally require cooler
temperatures to form sufficiently stable hybrid complexes with the template. Longer primers are
expensive to produce and can sometimes self-hybridize to form hairpin structures. The formation of
stable hybrids depends on the melting temperature (Tm) of the DNA. The Tm depends on the length of
the primer, the ionic strength of the solution and the G+C content. The higher the G+C content of the
primer, the higher is the melting temperature because G:C pairs are held by three H bonds whereas
A:T pairs have only two. The G+C content of the amplification primers of the present invention
preferably ranges between 10 and 75 %, more preferably between 35 and 60 %, and most preferably
between 40 and 55 %. The appropriate length for primers under a particular set of assay conditions is
empirically determined by one of skill in the art.
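
By way of illustration only, the length and G+C criteria above can be checked with a short script. The
following is a minimal sketch and not part of the specification: the function names and the example
primer are hypothetical, and the Tm estimate uses the Wallace rule of thumb (2°C per A/T, 4°C per G/C),
which is an assumption introduced here and ignores ionic strength.

```python
def gc_content(primer: str) -> float:
    """Return the G+C content of a primer as a percentage."""
    seq = primer.upper()
    return 100.0 * sum(base in "GC" for base in seq) / len(seq)


def wallace_tm(primer: str) -> int:
    """Rough Tm in degrees C (Wallace rule, an assumption): 2 per A/T plus 4 per G/C."""
    seq = primer.upper()
    at = sum(base in "AT" for base in seq)
    gc = sum(base in "GC" for base in seq)
    return 2 * at + 4 * gc


def in_most_preferred_range(primer: str) -> bool:
    """Check the narrowest ranges quoted above: 8 to 25 nt and 40 to 55 % G+C."""
    return 8 <= len(primer) <= 25 and 40.0 <= gc_content(primer) <= 55.0


# Hypothetical 20-mer used only to exercise the functions.
example_primer = "AGCTTGACCATGGCATTCAG"
print(gc_content(example_primer), wallace_tm(example_primer),
      in_most_preferred_range(example_primer))
```

Such a check only screens candidates; as stated above, the appropriate primer length for a given assay
remains an empirical determination.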

The spacing of the primers determines the length of the segment to be amplified. In the
context of the present invention amplified segments carrying biallelic markers can range in size from
at least about 25 bp to 35 kbp. Amplification fragments from 25-3000 bp are typical, fragments from
50-1000 bp are preferred and fragments from 100-600 bp are highly preferred. It will be appreciated
that amplification primers for the biallelic markers may be any sequence which allows the specific
amplification of any DNA fragment carrying the markers. Amplification primers may be labeled or
immobilized on a solid support as described in Section II.

X.C. Methods of Genotyping DNA samples for Biallelic Markers

Any method known in the art can be used to identify the nucleotide present at a biallelic
marker site. Since the biallelic marker allele to be detected has been identified and specified in the
present invention, detection will prove routine for one of ordinary skill in the art by employing any of
a number of techniques. Many genotyping methods require the previous amplification of the DNA
region carrying the biallelic marker of interest. While the amplification of target or signal is often
preferred at present, ultrasensitive detection methods which do not require amplification are also
encompassed by the present genotyping methods. Methods well-known to those skilled in the art that
can be used to detect biallelic polymorphisms include methods such as conventional dot blot
analyses, single strand conformational polymorphism analysis (SSCP) described by Orita et al. (Proc.
Natl. Acad. Sci. U.S.A. 86:2766-2770, 1989), denaturing gradient gel electrophoresis (DGGE),
heteroduplex analysis, mismatch cleavage detection, and other conventional techniques as described
in Sheffield, V.C. et al. (Proc. Natl. Acad. Sci. USA 49:699-706, 1991), White et al. (Genomics
12:301-306, 1992), Grompe, M. et al. (Proc. Natl. Acad. Sci. USA 86:5855-5892, 1989) and Grompe,
M. (Nature Genetics 5:111-117, 1993). Another method for determining the identity of the nucleotide
present at a particular polymorphic site employs a specialized exonuclease-resistant nucleotide
derivative as described in US patent 4,656,127.

Preferred methods involve directly determining the identity of the nucleotide present at a 
biallelic marker site by sequencing assay, allele-specific amplification assay, or hybridization assay. 
The following is a description of some preferred methods. A highly preferred method is the
microsequencing technique. The term "sequencing assay" is used herein to refer to polymerase
extension of duplex primer/template complexes and includes both traditional sequencing and 
microsequencing. 
1) Sequencing assays 

The nucleotide present at a polymorphic site can be determined by sequencing methods. In a
preferred embodiment, DNA samples are subjected to PCR amplification before sequencing as
described above. Methods for sequencing DNA using either the dideoxy-mediated method (Sanger
method) or the Maxam-Gilbert method are widely known to those of ordinary skill in the art. Such
methods are for example disclosed in Maniatis et al. (Molecular Cloning, A Laboratory Manual, Cold
Spring Harbor Press, Second Edition, 1989). Alternative approaches include hybridization to high-
density DNA probe arrays as described in Chee et al. (Science 274, 610, 1996).

Preferably, the amplified DNA is subjected to automated dideoxy terminator sequencing 
reactions using a dye-primer cycle sequencing protocol. The products of the sequencing reactions are 
run on sequencing gels and the sequences are determined using gel image analysis. 

The polymorphism detection in a pooled sample is based on the presence of superimposed
peaks in the electrophoresis pattern resulting from different bases occurring at the same position.
Because each dideoxy terminator is labeled with a different fluorescent molecule, the two peaks
corresponding to a biallelic site present distinct colors corresponding to two different nucleotides at
the same position on the sequence. However, the presence of two peaks can be an artifact due to
background noise. To exclude such an artifact, the two DNA strands are sequenced and a comparison
between the peaks is carried out. In order to be registered as a polymorphic sequence, the
polymorphism has to be detected on both strands.

The above procedure permits those amplification products which contain biallelic markers to
be identified. The detection limit for the frequency of biallelic polymorphisms detected by
sequencing pools of 100 individuals is approximately 0.1 for the minor allele, as verified by
sequencing pools of known allelic frequencies.
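
The two-strand confirmation described above can be sketched as follows. This is an illustrative
sketch only, not the analysis software used for the present invention: the function names are
hypothetical, and it assumes that the relative heights of the two superimposed peaks approximate
the allele frequencies in the pool, which the specification does not state.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}
DETECTION_LIMIT = 0.1  # minor-allele frequency limit quoted above for pools of 100


def minor_fraction(peaks: dict[str, float]) -> tuple[str, float]:
    """Return the second-highest base and its share of the two highest peaks."""
    ranked = sorted(peaks.items(), key=lambda item: item[1], reverse=True)
    (_, major_height), (minor_base, minor_height) = ranked[0], ranked[1]
    return minor_base, minor_height / (major_height + minor_height)


def is_polymorphic(forward: dict[str, float], reverse: dict[str, float]) -> bool:
    """Register a site only if both strands show the same minor allele.

    `reverse` holds the peak heights read on the opposite strand, so its bases
    are complemented before being compared with the forward read.
    """
    fwd_base, fwd_frac = minor_fraction(forward)
    rev_base, rev_frac = minor_fraction(reverse)
    same_allele = COMPLEMENT[rev_base] == fwd_base
    return same_allele and min(fwd_frac, rev_frac) >= DETECTION_LIMIT


# Hypothetical peak heights (arbitrary units) at one position of a pooled read.
forward_read = {"A": 800.0, "G": 200.0, "C": 5.0, "T": 3.0}
reverse_read = {"T": 750.0, "C": 250.0, "A": 4.0, "G": 6.0}
print(is_polymorphic(forward_read, reverse_read))
```

A second peak seen on one strand only is treated as background noise and is not registered.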
Microsequencing assays

In microsequencing methods, the nucleotide at a polymorphic site in a target DNA is detected by
a single nucleotide primer extension reaction. This method involves appropriate microsequencing
primers which hybridize just upstream of the polymorphic base of interest in the target nucleic acid.
A polymerase is used to specifically extend the 3' end of the primer with one single ddNTP (chain
terminator) complementary to the nucleotide at the polymorphic site. Next the identity of the
incorporated nucleotide is determined in any suitable way.
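
The primer placement and the read-out of a microsequencing reaction can be sketched as follows.
This is a hedged illustration only: the function names, the 20-nucleotide default and the example
sequence are hypothetical. The sequence is written 5' to 3' on one strand, the primer is taken from
that same strand and therefore anneals to the complementary strand, so the ddNTP incorporated
corresponds to the allele as written.

```python
def design_microsequencing_primer(sequence: str, marker_index: int, length: int = 20) -> str:
    """Return the `length` bases lying immediately 5' of the polymorphic base."""
    if marker_index < length:
        raise ValueError("not enough upstream sequence for the requested primer")
    return sequence[marker_index - length:marker_index].upper()


def genotype_from_terminators(detected: set[str], alleles: tuple[str, str]) -> str:
    """Translate the ddNTP bases detected on the extended primer into a genotype."""
    found = sorted(base for base in alleles if base in detected)
    if not found:
        raise ValueError("no expected allele detected")
    if len(found) == 1:
        return f"{found[0]}/{found[0]}"  # one terminator only: homozygote
    return "/".join(found)               # both terminators: heterozygote


# Hypothetical locus: 25 nt of upstream flank, an A/G marker, 6 nt downstream.
locus = "CCGATTACAGGCTTAACCGGTTAGC" + "A" + "TTGACC"
primer = design_microsequencing_primer(locus, marker_index=25)
print(primer, genotype_from_terminators({"A", "G"}, ("A", "G")))
```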

Typically, microsequencing reactions are carried out using fluorescent ddNTPs and the extended 
microsequencing primers are analyzed by electrophoresis on ABI 377 sequencing machines to 
determine the identity of the incorporated nucleotide as described in EP 412 883. Alternatively
capillary electrophoresis can be used in order to process a higher number of assays simultaneously.

Different approaches can be used to detect the nucleotide added to the microsequencing
primer. A homogeneous phase detection method based on fluorescence resonance energy transfer has
been described by Chen and Kwok (Nucleic Acids Research 25:347-353, 1997) and Chen et al. (Proc.
Natl. Acad. Sci. USA 94/20 10756-10761, 1997). In this method amplified genomic DNA fragments
containing polymorphic sites are incubated with a 5'-fluorescein-labeled primer in the presence of
allelic dye-labeled dideoxyribonucleoside triphosphates and a modified Taq polymerase. The dye-
labeled primer is extended one base by the dye-terminator specific for the allele present on the
template. At the end of the genotyping reaction, the fluorescence intensities of the two dyes in the
reaction mixture are analyzed directly without separation or purification. All these steps can be
performed in the same tube and the fluorescence changes can be monitored in real time.
Alternatively, the extended primer is analyzed by MALDI-TOF Mass Spectrometry. The base at the
polymorphic site is identified by the mass added onto the microsequencing primer (see Haff L.A. and
Smirnov I.P., Genome Research, 7:378-388, 1997).

Microsequencing may be achieved by the established microsequencing method or by developments
or derivatives thereof. Alternative methods include several solid-phase microsequencing techniques.
The basic microsequencing protocol is the same as described previously, except that the method is
conducted as a heterogeneous phase assay, in which the primer or the target molecule is immobilized
or captured onto a solid support. To simplify the primer separation and the terminal nucleotide
addition analysis, oligonucleotides are attached to solid supports or are modified in such ways that
permit affinity separation as well as polymerase extension. The 5' ends and internal nucleotides of
synthetic oligonucleotides can be modified in a number of different ways to permit different affinity
separation approaches, e.g., biotinylation. If a single affinity group is used on the oligonucleotides,
the oligonucleotides can be separated from the incorporated terminator reagent. This eliminates the
need of physical or size separation. More than one oligonucleotide can be separated from the
terminator reagent and analyzed simultaneously if more than one affinity group is used. This permits
the analysis of several nucleic acid species or more nucleic acid sequence information per extension
reaction. The affinity group need not be on the priming oligonucleotide but could alternatively be
present on the template. For example, immobilization can be carried out via an interaction between
biotinylated DNA and streptavidin-coated microtitration wells or avidin-coated polystyrene particles.
In the same manner oligonucleotides or templates may be attached to a solid support in a high-density
format. In such solid phase microsequencing reactions, incorporated ddNTPs can be radiolabeled
(Syvanen, Clinica Chimica Acta 226:225-236, 1994) or linked to fluorescein (Livak and Hainer,
Human Mutation 3:379-385, 1994). The detection of radiolabeled ddNTPs can be achieved through
scintillation-based techniques. The detection of fluorescein-linked ddNTPs can be based on the
binding of antifluorescein antibody conjugated with alkaline phosphatase, followed by incubation
with a chromogenic substrate (such as p-nitrophenyl phosphate). Other possible reporter-detection
pairs include: ddNTP linked to dinitrophenyl (DNP) and anti-DNP alkaline phosphatase conjugate
(Harju et al., Clin. Chem. 39/11 2282-2287, 1993) or biotinylated ddNTP and horseradish peroxidase-
conjugated streptavidin with o-phenylenediamine as a substrate (WO 92/15712). As yet another
alternative solid-phase microsequencing procedure, Nyren et al. (Analytical Biochemistry 208:171-
175, 1993) described a method relying on the detection of DNA polymerase activity by an enzymatic
luminometric inorganic pyrophosphate detection assay (ELIDA).

Pastinen et al. (Genome Research 7:606-614, 1997) describe a method for multiplex detection
of single nucleotide polymorphism in which the solid phase minisequencing principle is applied to an
oligonucleotide array format. High-density arrays of DNA probes attached to a solid support (DNA
chips) are further described in X.C.5.

In one aspect the present invention provides polynucleotides and methods to genotype one or
more biallelic markers of the present invention by performing a microsequencing assay. It will be
appreciated that any primer having a 3' end immediately adjacent to the polymorphic nucleotide may be
used. However, polynucleotides comprising at least 8, 12, 15, 20, 25, or 30 consecutive nucleotides of
the sequence immediately adjacent to the biallelic marker and having a 3' terminus immediately
upstream of the corresponding biallelic marker are well suited for determining the identity of a
nucleotide at a biallelic marker site.

Similarly, it will be appreciated that microsequencing analysis may be performed for any biallelic
marker or any combination of biallelic markers of the present invention.

Mismatch detection assays based on polymerases and ligases

In one aspect the present invention provides polynucleotides and methods to determine the
allele of one or more biallelic markers of the present invention in a biological sample, by mismatch
detection assays based on polymerases and/or ligases. These assays are based on the specificity of
polymerases and ligases. Polymerization reactions place particularly stringent requirements on
correct base pairing of the 3' end of the amplification primer, and the joining of two oligonucleotides
hybridized to a target DNA sequence is quite sensitive to mismatches close to the ligation site,
especially at the 3' end. Methods, primers and various parameters to amplify DNA fragments
comprising biallelic markers of the present invention are further described above in X.B.
Allele specific amplification

Discrimination between the two alleles of a biallelic marker can also be achieved by allele
specific amplification, a selective strategy, whereby one of the alleles is amplified without
amplification of the other allele. This is accomplished by placing the polymorphic base at the 3' end
of one of the amplification primers. Because the extension forms from the 3' end of the primer, a
mismatch at or near this position has an inhibitory effect on amplification. Therefore, under
appropriate amplification conditions, these primers only direct amplification on their complementary
allele. Designing the appropriate allele-specific primer and the corresponding assay conditions are
well within the ordinary skill in the art.
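
The placement of the polymorphic base at the 3' end of the primer can be sketched as follows. This
is a minimal illustration under stated assumptions: the function names and the example locus are
hypothetical, and the `amplifies` rule is an idealisation in which any 3'-terminal mismatch fully
blocks extension, whereas in practice discrimination depends on the assay conditions.

```python
def allele_specific_primers(sequence: str, marker_index: int,
                            alleles: tuple[str, str], length: int = 20) -> dict[str, str]:
    """Return one primer per allele, each ending (3' end) on the polymorphic base."""
    start = marker_index - (length - 1)
    if start < 0:
        raise ValueError("not enough upstream sequence for the requested primer")
    stem = sequence[start:marker_index].upper()
    return {allele: stem + allele for allele in alleles}


def amplifies(primer: str, template_allele: str) -> bool:
    """Idealised rule: extension occurs only when the 3'-terminal base matches."""
    return primer[-1] == template_allele


# Hypothetical locus with an A/G marker at index 25 (sequence written 5' to 3').
locus = "CCGATTACAGGCTTAACCGGTTAGC" + "A" + "TTGACC"
primers = allele_specific_primers(locus, 25, ("A", "G"))
for allele, primer in primers.items():
    print(allele, primer, amplifies(primer, "A"))
```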
Ligation/amplification based methods

The "Oligonucleotide Ligation Assay" (OLA) uses two oligonucleotides which are designed
to be capable of hybridizing to abutting sequences of a single strand of a target molecule. One of the
oligonucleotides is biotinylated, and the other is detectably labeled. If the precise complementary
sequence is found in a target molecule, the oligonucleotides will hybridize such that their termini
abut, and create a ligation substrate that can be captured and detected. OLA is capable of detecting
single nucleotide polymorphisms and is advantageously combined with PCR as described by
Nickerson D.A. et al. (Proc. Natl. Acad. Sci. USA, 87:8923-8927, 1990). In this method, PCR is used
to achieve the exponential amplification of target DNA, which is then detected using OLA.

Other methods which are particularly suited for the detection of single nucleotide
polymorphism include LCR (ligase chain reaction) and Gap LCR (GLCR), which are described above in
X.B. As mentioned above, LCR uses two pairs of probes to exponentially amplify a specific target.
The sequences of each pair of oligonucleotides are selected to permit the pair to hybridize to abutting
sequences of the same strand of the target. Such hybridization forms a substrate for a template-
dependent ligase. In accordance with the present invention, LCR can be performed with
oligonucleotides having the proximal and distal sequences of the same strand of a biallelic marker
site. In one embodiment, either oligonucleotide will be designed to include the biallelic marker site.
In such an embodiment, the reaction conditions are selected such that the oligonucleotides can be
ligated together only if the target molecule either contains or lacks the specific nucleotide that is
complementary to the biallelic marker on the oligonucleotide. In an alternative embodiment, the
oligonucleotides will not include the biallelic marker, such that when they hybridize to the target
molecule, a "gap" is created as described in WO 90/01069. This gap is then "filled" with
complementary dNTPs (as mediated by DNA polymerase), or by an additional pair of
oligonucleotides. Thus at the end of each cycle, each single strand has a complement capable of
serving as a target during the next cycle and exponential allele-specific amplification of the desired
sequence is obtained.
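
The abutting-oligonucleotide design common to OLA and LCR can be sketched as follows for the
embodiment in which one oligonucleotide includes the biallelic marker site. This is an illustrative
sketch only: the function names, the 12-nucleotide arm length and the example loci are hypothetical.
Both oligonucleotides are written in the sense of the given strand, so they would anneal to its
complement, and ligation is idealised as requiring a perfect match across the junction.

```python
def abutting_oligos(sequence: str, marker_index: int, allele: str,
                    arm: int = 12) -> tuple[str, str]:
    """Upstream oligo ends on the marker allele; downstream oligo starts right after it."""
    up_start = marker_index - (arm - 1)
    down_end = marker_index + 1 + arm
    if up_start < 0 or down_end > len(sequence):
        raise ValueError("not enough flanking sequence for the requested arm length")
    seq = sequence.upper()
    upstream = seq[up_start:marker_index] + allele.upper()
    downstream = seq[marker_index + 1:down_end]
    return upstream, downstream


def can_ligate(upstream: str, downstream: str, target: str) -> bool:
    """Idealised rule: ligation only if the joined oligos match the target exactly."""
    return (upstream + downstream) in target.upper()


# Hypothetical targets differing only at the marker (index 25).
up_flank, down_flank = "CCGATTACAGGCTTAACCGGTTAGC", "TTGACCATGGCA"
target_a = up_flank + "A" + down_flank
target_g = up_flank + "G" + down_flank
up, down = abutting_oligos(target_a, 25, "A")
print(can_ligate(up, down, target_a), can_ligate(up, down, target_g))
```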

Ligase/Polymerase-mediated Genetic Bit Analysis™ is another method for determining the
identity of a nucleotide at a preselected site in a nucleic acid molecule (WO 95/21271). This method
involves the incorporation of a nucleoside triphosphate that is complementary to the nucleotide
present at the preselected site onto the terminus of a primer molecule, and their subsequent ligation to
a second oligonucleotide. The reaction is monitored by detecting a specific label attached to the
reaction's solid phase or by detection in solution.

2) Hybridization assay methods

A preferred method of determining the identity of the nucleotide present at a biallelic marker
site involves nucleic acid hybridization. The hybridization probes, which can be conveniently used in
such reactions, preferably include the probes defined herein. Any hybridization assay may be used
including Southern hybridization, Northern hybridization, dot blot hybridization and solid-phase
hybridization (see Sambrook et al., Molecular Cloning - A Laboratory Manual, Second Edition, Cold
Spring Harbor Press, N.Y., 1989).

Hybridization refers to the formation of a duplex structure by two single stranded nucleic
acids due to complementary base pairing. Hybridization can occur between exactly complementary
nucleic acid strands or between nucleic acid strands that contain minor regions of mismatch. Specific
probes can be designed that hybridize to one form of a biallelic marker and not to the other and
therefore are able to discriminate between different allelic forms. Allele-specific probes are often
used in pairs, one member of a pair showing perfect match to a target sequence containing the original
allele and the other showing a perfect match to the target sequence containing the alternative allele.
Hybridization conditions should be sufficiently stringent that there is a significant difference in
hybridization intensity between alleles, and preferably an essentially binary response, whereby a
probe hybridizes to only one of the alleles. Stringent, sequence specific hybridization conditions,
under which a probe will hybridize only to the exactly complementary target sequence are well known
in the art (Sambrook et al., Molecular Cloning - A Laboratory Manual, Second Edition, Cold Spring
Harbor Press, N.Y., 1989). Stringent conditions are sequence dependent and will be different in
different circumstances. Generally, stringent conditions are selected to be about 5°C lower than the
thermal melting point (Tm) for the specific sequence at a defined ionic strength and pH. By way of
example and not limitation, procedures using conditions of high stringency are as follows:
Prehybridization of filters containing DNA is carried out for 8 h to overnight at 65°C in buffer
composed of 6X SSC, 50 mM Tris-HCl (pH 7.5), 1 mM EDTA, 0.02% PVP, 0.02% Ficoll, 0.02%
BSA, and 500 µg/ml denatured salmon sperm DNA. Filters are hybridized for 48 h at 65°C, the
preferred hybridization temperature, in prehybridization mixture containing 100 µg/ml denatured
salmon sperm DNA and 5-20 X 10^6 cpm of 32P-labeled probe. Alternatively, the hybridization step
can be performed at 65°C in the presence of SSC buffer, 1 x SSC corresponding to 0.15 M NaCl and
0.05 M Na citrate. Subsequently, filter washes can be done at 37°C for 1 h in a solution containing
2X SSC, 0.01% PVP, 0.01% Ficoll, and 0.01% BSA, followed by a wash in 0.1X SSC at 50°C for 45
min. Alternatively, filter washes can be performed in a solution containing 2 x SSC and 0.1% SDS,
or 0.5 x SSC and 0.1% SDS, or 0.1 x SSC and 0.1% SDS at 68°C for 15 minute intervals. Following
the wash steps, the hybridized probes are detectable by autoradiography. By way of example and not
limitation, procedures using conditions of intermediate stringency are as follows: Filters containing
DNA are prehybridized, and then hybridized at a temperature of 60°C in the presence of a 5 x SSC
buffer and labeled probe. Subsequently, filter washes are performed in a solution containing 2x SSC
at 50°C and the hybridized probes are detectable by autoradiography. Other conditions of high and
intermediate stringency which may be used are well known in the art and as cited in Sambrook et al.
(Molecular Cloning - A Laboratory Manual, Second Edition, Cold Spring Harbor Press, N.Y., 1989)
and Ausubel et al. (Current Protocols in Molecular Biology, Greene Publishing Associates and Wiley
Interscience, N.Y., 1989).
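
The arithmetic behind the stringency rule and the SSC dilutions above can be sketched as follows.
This is a minimal illustration, not a protocol: the function names are hypothetical, the Tm value
is an arbitrary example that would in practice be measured or estimated for the specific probe, and
only the NaCl component of SSC is computed.

```python
NACL_MOLAR_PER_1X_SSC = 0.15  # from the text: 1 x SSC corresponds to 0.15 M NaCl


def stringent_temperature(tm_celsius: float, offset: float = 5.0) -> float:
    """Stringent hybridization temperature: about 5 degrees C below the Tm."""
    return tm_celsius - offset


def nacl_molarity(ssc_fold: float) -> float:
    """NaCl concentration (mol/l) contributed by an n-fold SSC buffer."""
    return ssc_fold * NACL_MOLAR_PER_1X_SSC


# Wash series of decreasing ionic strength quoted above (2x, 0.5x, 0.1x SSC).
for fold in (2.0, 0.5, 0.1):
    print(f"{fold} x SSC -> {nacl_molarity(fold):.3f} M NaCl")

# Hypothetical probe with a Tm of 70 degrees C.
print(stringent_temperature(70.0))
```

Lowering the SSC concentration lowers the ionic strength of the wash, which is why the successive
washes listed above are progressively more stringent.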

Although such hybridizations can be performed in solution, it is preferred to employ a solid-
phase hybridization assay. The target DNA comprising a biallelic marker of the present invention may be
amplified prior to the hybridization reaction. The presence of a specific allele in the sample is
determined by detecting the presence or the absence of stable hybrid duplexes formed between the
probe and the target DNA. The detection of hybrid duplexes can be carried out by a number of
methods. Various detection assay formats are well known which utilize detectable labels bound to
either the target or the probe to enable detection of the hybrid duplexes. Typically, hybridization
duplexes are separated from unhybridized nucleic acids and the labels bound to the duplexes are then
detected. Those skilled in the art will recognize that wash steps may be employed to wash away excess
target DNA or probe. Standard heterogeneous assay formats are suitable for detecting the hybrids
using the labels present on the primers and probes.

Two recently developed assays allow hybridization-based allele discrimination with no need
for separations or washes (see Landegren U. et al., Genome Research, 8:769-776, 1998). The TaqMan
assay takes advantage of the 5' nuclease activity of Taq DNA polymerase to digest a DNA probe
annealed specifically to the accumulating amplification product. TaqMan probes are labeled with a
donor-acceptor dye pair that interacts via fluorescence energy transfer. Cleavage of the TaqMan
probe by the advancing polymerase during amplification dissociates the donor dye from the quenching
acceptor dye, greatly increasing the donor fluorescence. All reagents necessary to detect two allelic
variants can be assembled at the beginning of the reaction and the results are monitored in real time
(see Livak et al., Nature Genetics, 9:341-342, 1995). In an alternative homogeneous hybridization
based procedure, molecular beacons are used for allele discriminations. Molecular beacons are
hairpin-shaped oligonucleotide probes that report the presence of specific nucleic acids in
homogeneous solutions. When they bind to their targets they undergo a conformational
reorganization that restores the fluorescence of an internally quenched fluorophore (Tyagi et al.,
Nature Biotechnology, 16:49-53, 1998).
The polynucleotides provided herein can be used in hybridization assays for the detection of
biallelic marker alleles in biological samples. These probes are characterized in that they preferably
comprise between 8 and 50 nucleotides, and in that they are sufficiently complementary to a sequence
comprising a biallelic marker of the present invention to hybridize thereto and preferably sufficiently
specific to be able to discriminate the targeted sequence for only one nucleotide variation. The GC
content in the probes of the invention usually ranges between 10 and 75 %, preferably between 35 and
60 %, and more preferably between 40 and 55 %. The length of these probes can range from 10, 15,
20, or 30 to at least 100 nucleotides, preferably from 10 to 50, more preferably from 18 to 35
nucleotides. A particularly preferred probe is 25 nucleotides in length. Preferably the biallelic
marker is within 4 nucleotides of the center of the polynucleotide probe. In particularly preferred
probes the biallelic marker is at the center of said polynucleotide. Shorter probes may lack specificity
for a target nucleic acid sequence and generally require cooler temperatures to form sufficiently stable
hybrid complexes with the template. Longer probes are expensive to produce and can sometimes self-
hybridize to form hairpin structures. Methods for the synthesis of oligonucleotide probes have been
described above and can be applied to the probes of the present invention.
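
The particularly preferred probe geometry described above, a 25-mer with the biallelic marker at
its center, can be sketched as follows. This is an illustration only: the function names and the
example locus are hypothetical, and the probes are written in the sense of the given strand, so the
reverse complement would be taken where the opposite strand is targeted.

```python
def centered_probe(sequence: str, marker_index: int, allele: str, length: int = 25) -> str:
    """Return a probe of `length` nt carrying `allele` at its central position."""
    half = length // 2
    start = marker_index - half
    end = start + length
    if start < 0 or end > len(sequence):
        raise ValueError("not enough flanking sequence for the requested probe")
    seq = sequence.upper()
    return seq[start:marker_index] + allele.upper() + seq[marker_index + 1:end]


def allele_specific_pair(sequence: str, marker_index: int,
                         alleles: tuple[str, str], length: int = 25) -> dict[str, str]:
    """Return one centered probe per allele; the two differ at the marker only."""
    return {allele: centered_probe(sequence, marker_index, allele, length)
            for allele in alleles}


# Hypothetical locus: 25 nt upstream, an A/G marker at index 25, 12 nt downstream.
locus = "CCGATTACAGGCTTAACCGGTTAGC" + "A" + "TTGACCATGGCA"
for allele, probe in allele_specific_pair(locus, 25, ("A", "G")).items():
    print(allele, probe)
```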
Preferably the probes of the present invention are labeled or immobilized on a solid support.
Labels and solid supports are further described in II. Detection probes are generally nucleic acid
sequences or uncharged nucleic acid analogs such as, for example, peptide nucleic acids which are
disclosed in International Patent Application WO 92/20702, or morpholino analogs which are described
in U.S. Patents Numbered 5,185,444; 5,034,506 and 5,142,047. The probe may have to be rendered
"non-extendable" in that additional dNTPs cannot be added to the probe. In and of themselves
analogs usually are non-extendable and nucleic acid probes can be rendered non-extendable by
modifying the 3' end of the probe such that the hydroxyl group is no longer capable of participating in
elongation. For example, the 3' end of the probe can be functionalized with the capture or detection
label to thereby consume or otherwise block the hydroxyl group. Alternatively, the 3' hydroxyl group
simply can be cleaved, replaced or modified. U.S. Patent Application Serial No. 07/049,061 filed
April 19, 1993 describes modifications, which can be used to render a probe non-extendable.

The probes of the present invention are useful for a number of purposes. They can be used in 
Southern hybridization to genomic DNA or Northern hybridization to mRNA. The probes can also be 
used to detect PCR amplification products. By assaying the hybridization to an allele specific probe, 
one can detect the presence or absence of a biallelic marker allele in a given sample.

High-throughput parallel hybridizations in array format are specifically encompassed within
"hybridization assays" and are described below.
Hybridization to addressable arrays of oligonucleotides 

Hybridization assays based on oligonucleotide arrays rely on the differences in hybridization
stability of short oligonucleotides to perfectly matched and mismatched target sequence variants.
Efficient access to polymorphism information is obtained through a basic structure comprising high-
density arrays of oligonucleotide probes attached to a solid support (the chip) at selected positions.
Each DNA chip can contain thousands to millions of individual synthetic DNA probes arranged in a
grid-like pattern and miniaturized to the size of a dime.

The chip technology has already been applied with success in numerous cases. For example,
the screening of mutations has been undertaken in the BRCA1 gene, in S. cerevisiae mutant strains,
and in the protease gene of HIV-1 virus (Hacia et al., Nature Genetics, 14(4):441-447, 1996;
Shoemaker et al., Nature Genetics, 14(4):450-456, 1996; Kozal et al., Nature Medicine, 2:753-759,
1996). Chips of various formats for use in detecting biallelic polymorphisms can be produced on a
customized basis by Affymetrix (GeneChip™), Hyseq (HyChip and HyGnostics), and Protogene
Laboratories.

In general, these methods employ arrays of oligonucleotide probes that are complementary to
target nucleic acid sequence segments from an individual, which target sequences include a
polymorphic marker. EP785280 describes a tiling strategy for the detection of single nucleotide
polymorphisms. Briefly, arrays may generally be "tiled" for a large number of specific
polymorphisms. By "tiling" is generally meant the synthesis of a defined set of oligonucleotide probes
which is made up of a sequence complementary to the target sequence of interest, as well as
preselected variations of that sequence, e.g., substitution of one or more given positions with one or
more members of the basis set of monomers, i.e. nucleotides. Tiling strategies are further described in
PCT application No. WO 95/11995. In a particular aspect, arrays are tiled for a number of specific,
identified biallelic marker sequences. In particular the array is tiled to include a number of detection
blocks, each detection block being specific for a specific biallelic marker or a set of biallelic markers.
For example, a detection block is tiled to include a number of probes, which span the sequence
segment that includes a specific polymorphism. To ensure probes that are complementary to each
allele, the probes are synthesized in pairs differing at the biallelic marker. In addition to the probes
differing at the polymorphic base, monosubstituted probes are also generally tiled within the detection
block. These monosubstituted probes have bases at and up to a certain number of bases in either
direction from the polymorphism, substituted with the remaining nucleotides (selected from A, T, G,
C and U). Typically the probes in a tiled detection block will include substitutions of the sequence
positions up to and including those that are 5 bases away from the biallelic marker. The
monosubstituted probes provide internal controls for the tiled array, to distinguish actual hybridization
from artefactual cross-hybridization. Upon completion of hybridization with the target sequence and
washing of the array, the array is scanned to determine the position on the array to which the target
sequence hybridizes. The hybridization data from the scanned array is then analyzed to identify
which allele or alleles of the biallelic marker are present in the sample. Hybridization and scanning may be
carried out as described in PCT application No. WO 92/10092 and WO 95/11995 and US patent No.
5,424,186.
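
The composition of one tiled detection block, as described above, can be sketched as follows. This
is a hedged illustration and not the layout of any commercial array: the function name, the
25-nucleotide probe length and the example locus are hypothetical, only the four DNA bases are
substituted, and the probes are written in the sense of the given strand.

```python
BASES = "ACGT"


def detection_block(sequence: str, marker_index: int, alleles: tuple[str, str],
                    probe_length: int = 25, window: int = 5):
    """Return (allele probe pair, monosubstituted control probes) for one marker.

    Controls carry one substitution at the marker or within `window` bases of it.
    """
    half = probe_length // 2
    start, end = marker_index - half, marker_index - half + probe_length
    if start < 0 or end > len(sequence):
        raise ValueError("sequence too short for the requested probe length")
    seq = sequence.upper()
    allele_probes = {a: seq[start:marker_index] + a + seq[marker_index + 1:end]
                     for a in alleles}
    controls = set()
    for reference in allele_probes.values():
        for position in range(half - window, half + window + 1):
            for base in BASES:
                if base != reference[position]:
                    controls.add(reference[:position] + base + reference[position + 1:])
    # A substitution at the marker can recreate the other allele probe; drop those.
    controls -= set(allele_probes.values())
    return allele_probes, sorted(controls)


# Hypothetical locus with an A/G marker at index 25.
locus = "CCGATTACAGGCTTAACCGGTTAGC" + "A" + "TTGACCATGGCA"
pair, controls = detection_block(locus, 25, ("A", "G"))
print(len(pair), len(controls))
```

With two alleles and a window of 5 bases, each block in this sketch holds the two allele probes
plus the distinct single-base variants of both, which serve as the internal controls mentioned above.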

5) Integrated Systems 

Another technique, which may be used to analyze polymorphisms, includes multicomponent
integrated systems, which miniaturize and compartmentalize processes such as PCR and capillary
electrophoresis reactions in a single functional device. An example of such technique is disclosed in
US patent 5,589,136, which describes the integration of PCR amplification and capillary
electrophoresis in chips.

Integrated systems can be envisaged mainly when microfluidic systems are used. These 
systems comprise a pattern of microchannels designed onto a glass, silicon, quartz, or plastic wafer 
included on a microchip. The movements of the samples are controlled by electric, electroosmotic or 
hydrostatic forces applied across different areas of the microchip to create functional microscopic
valves and pumps with no moving parts. Varying the voltage controls the liquid flow at intersections 
between the micro-machined channels and changes the liquid flow rate for pumping across different 
sections of the microchip. 

For genotyping biallelic markers, the microfluidic system may integrate nucleic acid 
amplification, microsequencing, capillary electrophoresis and a detection method such as laser-
induced fluorescence detection. 

XI. METHODS OF GENETIC ANALYSIS USING THE BIALLELIC MARKERS OF THE 
PRESENT INVENTION 

The methods available for the genetic analysis of complex traits fall into different categories
(see Lander and Schork, Science, 265, 2037-2048, 1994). In general, the biallelic markers of the
present invention find use in any method known in the art to demonstrate a statistically significant
correlation between a genotype and a phenotype. The biallelic markers may be used in linkage analysis and
in allele-sharing methods. Preferably, the biallelic markers of the present invention are used to
identify genes associated with detectable traits using association studies, an approach which does not
require the use of affected families and which permits the identification of genes associated with
complex and sporadic traits.

The genetic analysis using the biallelic markers of the present invention may be conducted on any
scale. The whole set of biallelic markers of the present invention or any subset of biallelic markers of
the present invention may be used. In some embodiments, any additional set of genetic markers including a
biallelic marker of the present invention may be used. As mentioned above, it should be noted that the
biallelic markers of the present invention may be included in any complete or partial genetic map of the
human genome. These different uses are specifically contemplated in the present invention and
claims.

XI.A. Linkage Analysis 

Until recently, the identification of genes linked with detectable traits has mainly relied on a
statistical approach called linkage analysis. Linkage analysis involves proposing a model to explain
the inheritance pattern of phenotypes and genotypes observed in a pedigree. Linkage analysis is
based upon establishing a correlation between the transmission of genetic markers and that of a
specific trait throughout generations within a family. In this approach, all members of a series of
affected families are genotyped with a few hundred markers, typically microsatellite markers, which
are distributed at an average density of one every 10 Mb. By comparing genotypes in all family
members, one can attribute sets of alleles to parental haploid genomes (haplotyping or phase
determination). The origin of recombined fragments is then determined in the offspring of all
families. Those that co-segregate with the trait are tracked. After pooling data from all families,
statistical methods are used to determine the likelihood that the marker and the trait are segregating
independently in all families. As a result of the statistical analysis, one or several regions having a
high probability of harboring a gene linked to the trait are selected as candidates for further analysis.
The result of linkage analysis is considered significant (i.e. there is a high probability that the
region contains a gene involved in a detectable trait) when the chance of independent segregation of
the marker and the trait is lower than 1 in 1000 (expressed as a LOD score > 3). Generally, the length
of the candidate region identified as having a LOD score of greater than 3 using linkage analysis is
between 2 and 20 Mb. Once a candidate region is identified as described above, analysis of
recombinant individuals using additional markers allows further delineation of the candidate region.
Linkage analysis studies have generally relied on the use of a maximum of 5,000 microsatellite
markers, thus limiting the maximum theoretical attainable resolution of linkage analysis to about 600
kb on average.
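The LOD threshold mentioned above can be illustrated with a short calculation. The following is only a minimal sketch, assuming the simplest textbook case of phase-known, fully informative meioses; the function name `lod_score` and the example counts are hypothetical and are not taken from the present application.

```python
import math

def lod_score(recombinants: int, meioses: int, theta: float) -> float:
    """LOD = log10 of the likelihood at recombination fraction theta
    divided by the likelihood under independent segregation (theta = 0.5).
    Assumes phase-known, fully informative meioses (an illustrative simplification)."""
    if not 0.0 < theta <= 0.5:
        raise ValueError("theta must lie in (0, 0.5]")
    k, n = recombinants, meioses
    log_l_theta = k * math.log10(theta) + (n - k) * math.log10(1.0 - theta)
    log_l_null = n * math.log10(0.5)
    return log_l_theta - log_l_null

# Hypothetical data: 1 recombinant observed among 20 informative meioses.
lod = lod_score(recombinants=1, meioses=20, theta=0.05)
print(f"LOD = {lod:.2f}, odds in favor of linkage = {10 ** lod:.0f}:1")
```

A LOD score of 3 corresponds to odds of 1000 to 1 in favor of linkage, which is the threshold of "1 in 1000" cited above.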
Linkage analysis has been successfully applied to map simple genetic traits that show clear
Mendelian inheritance patterns and which have a high penetrance (i.e., the ratio between the number
of trait positive carriers of allele a and the total number of a carriers in the population). About 100
pathological trait-causing genes were discovered using linkage analysis over the last 10 years. In
most of these cases, the majority of affected individuals had affected relatives and the detectable trait
was rare in the general population (frequencies less than 0.1%). In about 10 cases, such as
Alzheimer's Disease, breast cancer, and Type II diabetes, the detectable trait was more common but
the allele associated with the detectable trait was rare in the affected population. Thus, the alleles
associated with these traits were not responsible for the trait in all sporadic cases.
Linkage analysis suffers from a variety of drawbacks. First, linkage analysis is limited by its
reliance on the choice of a genetic model suitable for each studied trait. Furthermore, as already
mentioned, the resolution attainable using linkage analysis is limited, and complementary studies are
required to refine the analysis of the typical 2 Mb to 20 Mb regions initially identified through linkage
analysis. In addition, linkage analysis approaches have proven difficult when applied to complex
genetic traits, such as those due to the combined action of multiple genes and/or environmental
factors. In such cases, too large an effort and cost are needed to recruit the adequate number of
affected families required for applying linkage analysis to these situations, as recently discussed by
Risch, N. and Merikangas, K. (Science, 273:1516-1517, 1996). Finally, linkage analysis cannot be
applied to the study of traits for which no large informative families are available. Typically, this will
be the case in any attempt to identify trait-causing alleles involved in sporadic cases, such as alleles
associated with positive or negative responses to drug treatment.
XI.B. Allele-Sharing methods 

Whereas linkage analysis involves proposing a model to explain the inheritance pattern of
phenotypes and genotypes in a pedigree, allele-sharing methods are not based on constructing a
model, but rather on rejecting a model (see Lander and Schork, Science, 265, 2037-2048, 1994).
More specifically, one tries to prove that the inheritance pattern of a chromosomal region is not
consistent with random Mendelian segregation by showing that affected relatives inherit identical
copies of the region more often than expected by chance. Because allele-sharing methods are
nonparametric (that is, assume no model for the inheritance of the trait), they tend to be more useful
for the analysis of complex traits than linkage analysis. Affected relatives should show excess allele
sharing even in the presence of incomplete penetrance and polygenic inheritance. Allele-sharing
methods involve studying affected relatives in a pedigree to determine how often a particular copy of
a chromosomal region is shared identical-by-descent (IBD), that is, is inherited from a common
ancestor within the pedigree. The frequency of IBD sharing at a locus can then be compared with
random expectation. Affected sib pair analysis is a well-known special case and is the simplest form
of this method.
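As an illustration of comparing IBD sharing with random expectation, the sketch below applies a simple "mean test" to affected sib pairs. It assumes that the number of alleles shared IBD (0, 1 or 2) is known for each pair and uses the standard null expectation for full sibs (probabilities 1/4, 1/2, 1/4). The function name and the counts are hypothetical.

```python
import math

def mean_ibd_test(n0: int, n1: int, n2: int):
    """Mean test for affected sib pairs.

    n0, n1, n2 are the numbers of pairs sharing 0, 1 or 2 alleles IBD.
    Under random Mendelian segregation the per-pair sharing proportion
    has mean 1/2 and variance 1/8, so the sample mean over n pairs has
    variance 1/(8n)."""
    n = n0 + n1 + n2
    if n == 0:
        raise ValueError("at least one sib pair is required")
    proportion_shared = (n1 + 2 * n2) / (2 * n)
    z = (proportion_shared - 0.5) / math.sqrt(1.0 / (8 * n))
    return proportion_shared, z

# Hypothetical counts for 100 affected sib pairs.
proportion, z = mean_ibd_test(n0=10, n1=50, n2=40)
print(f"proportion shared IBD = {proportion:.2f}, z = {z:.2f}")
```

A large positive z value indicates that affected sibs share the region more often than expected by chance.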
However, as allele-sharing methods analyze affected relatives, they tend to be of limited value
in the genetic analysis of drug responses or in the analysis of side effects to treatments. This type of
analysis is impractical in such cases due to the lack of availability of familial cases. In fact, the
likelihood of having more than one individual in a family being exposed to the same drug at the same
time is very low.

XI.C. Association Studies 

The present invention comprises methods for identifying one or several genes among a set of
candidate genes that are associated with a detectable trait using the biallelic markers of the present
invention. In one embodiment the present invention comprises methods to detect an association
between a biallelic marker allele or a biallelic marker haplotype and a trait. Further, the invention
comprises methods to identify a trait causing allele in linkage disequilibrium with any biallelic marker
allele of the present invention.

As described above, alternative approaches can be employed to perform association studies:
genome-wide association studies, candidate region association studies and candidate gene association
studies. In a preferred embodiment, the biallelic markers of the present invention are used to perform
candidate gene association studies. The candidate gene analysis clearly provides a short-cut approach
to the identification of genes and gene polymorphisms related to a particular trait when some
information concerning the biology of the trait is available. Further, the biallelic markers of the
present invention may be incorporated in any map of genetic markers of the human genome in order to
perform genome-wide association studies. Methods to generate a high-density map of biallelic
markers have been described in US Provisional Patent application serial number 60/082,614. The
biallelic markers of the present invention may further be incorporated in any map of a specific
candidate region of the genome (a specific chromosome or a specific chromosomal region for
example).

As mentioned above, association studies may be conducted within the general population and are
not limited to studies performed on related individuals in affected families. Linkage disequilibrium
and association studies are extremely valuable as they permit the analysis of sporadic or multifactor
traits. Moreover, association studies represent a powerful method for fine-scale mapping, enabling
much finer mapping of trait causing alleles than linkage studies. Studies based on pedigrees often
only narrow the location of the trait causing allele. Association studies and Linkage Disequilibrium
mapping methods using the biallelic markers of the present invention can therefore be used to refine
the location of a trait causing allele in a candidate region identified by Linkage Analysis or by
Allele-Sharing methods. Moreover, once a chromosome segment of interest has been identified, the
presence of a candidate gene, such as a candidate gene of the present invention, in the region of
interest can provide a shortcut to the identification of the trait causing allele. Biallelic markers of the
present invention can be used to demonstrate that a candidate gene is associated with a trait. Such
uses are specifically contemplated in the present invention and claims.
1) Case-control populations (inclusion criteria)

Association studies do not concern familial inheritance and do not involve the analysis of
large family pedigrees but compare the prevalence of a particular genetic marker, or a set of markers,
in case-control populations. They are case-control studies based on comparison of unrelated case
(affected or trait positive) individuals and unrelated control (random or unaffected or trait negative)
individuals. The control group is composed of individuals chosen randomly or of unaffected (trait
negative) individuals; preferably, the control group is composed of unaffected or trait negative
individuals. Further, the control group is preferably both ethnically and age-matched to the case
population. In the following "trait positive population", "case population" and "affected population"
are used interchangeably.

An important step in the dissection of complex traits using association studies is the choice of
case-control populations (see Lander and Schork, Science, 265, 2037-2048, 1994). Narrowing the
definition of the disease and restricting the patient population to extreme phenotypes allows one to
work with a trait that is more nearly Mendelian in its inheritance pattern and more likely to be
homogeneous (patients suffer from the disease for the same genetic reasons). Therefore, a major step
in the choice of case-control populations is the clinical definition of a given trait or phenotype. Four
criteria are often useful: clinical phenotype, age at onset, family history and severity. In
order to perform efficient and significant association studies, such as those described herein, the trait
under study should preferably follow a bimodal distribution in the population under study, presenting
two clear non-overlapping phenotypes (trait positive and trait negative). Nevertheless, even in the
absence of such bimodal distribution (as may in fact be the case for more complex genetic traits), any
genetic trait may still be analyzed by the association method proposed here by carefully selecting the
individuals to be included in the trait positive and trait negative phenotypic groups. The selection
procedure involves selecting individuals at opposite ends of the non-bimodal phenotype spectra of the
trait under study, so as to include in these trait positive and trait negative populations individuals
which clearly represent extreme, preferably non-overlapping phenotypes. This is particularly useful
for continuous or quantitative traits (such as blood pressure for example). Selection of individuals at
extreme ends of the trait distribution increases the ability to analyze these complex traits. The
definition of the inclusion criteria for the case-control populations is an important aspect of
association studies. The selection of those drastically different but relatively uniform phenotypes
enables efficient comparisons in association studies and the possible detection of marked differences
at the genetic level, provided that the sample sizes of the populations under study are significant
enough.
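The selection of individuals at opposite ends of a quantitative trait distribution can be sketched as follows. This is only an illustration: the function name, the 20% tail fraction, the blood pressure values, and the convention that high values define the trait positive group are all assumptions made for the example.

```python
def select_extremes(trait_values: dict, fraction: float = 0.2):
    """Return (trait_positive, trait_negative) identifier lists taken from
    the upper and lower tails of a quantitative trait distribution.

    `fraction` is the share of the studied population placed in each group;
    it is capped at 0.5 so that the two groups cannot overlap."""
    if not 0.0 < fraction <= 0.5:
        raise ValueError("fraction must lie in (0, 0.5]")
    if len(trait_values) < 2:
        raise ValueError("at least two individuals are required")
    ranked = sorted(trait_values, key=trait_values.get)  # ascending trait value
    k = max(1, int(len(ranked) * fraction))
    trait_negative = ranked[:k]    # lowest values
    trait_positive = ranked[-k:]   # highest values
    return trait_positive, trait_negative

# Hypothetical systolic blood pressure measurements (mm Hg).
pressures = {f"id{i}": 100 + 5 * i for i in range(10)}
positives, negatives = select_extremes(pressures, fraction=0.2)
print(positives, negatives)
```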
Preferably, case-control populations to be included in association studies such as those
proposed in the present invention consist of phenotypically homogeneous populations of individuals
each representing 100% of the corresponding phenotype if the trait distribution is bimodal. If the trait
distribution is non-bimodal, trait positive and trait negative populations consist of phenotypically
uniform populations of individuals representing each between 1 and 98%, preferably between 1 and
80%, more preferably between 1 and 50%, and more preferably between 1 and 30%, most preferably
between 1 and 20% of the total population under study, and selected among individuals exhibiting
non-overlapping phenotypes. In some embodiments, the trait positive and trait negative groups
consist of individuals exhibiting the extreme phenotypes within the studied population. The clearer
the difference between the two trait phenotypes, the greater the probability of detecting an association
with biallelic markers.

In preferred embodiments, a first group of between 50 and 300 trait positive individuals,
preferably about 100 individuals, are recruited according to their phenotypes. A similar number of
trait negative individuals are included in such studies.

In the present invention, typical examples of inclusion criteria include a diagnosis of cancer or
prostate cancer or the evaluation of the response to anti-cancer or anti-prostate cancer agent or side
effects to treatment with anti-cancer or anti-prostate cancer agents.

Suitable examples of association studies using biallelic markers, including the biallelic
markers of the present invention, are studies involving the following populations:

a case population suffering from a form of cancer and a healthy unaffected control population, or

a case population suffering from a form of prostate cancer and a healthy unaffected control
population, or

a case population treated with anti-cancer agents suffering from side-effects resulting from the
treatment and a control population treated with the same agents showing no side-effects, or

a case population treated with anti-prostate cancer agents suffering from side-effects resulting from
the treatment and a control population treated with the same agents showing no side-effects, or

a case population treated with anti-cancer agents showing a beneficial response and a control
population treated with the same agents showing no beneficial response, or

a case population treated with anti-prostate cancer agents showing a beneficial response and a control
population treated with the same agents showing no beneficial response.

2) Determining the frequency of an allele in case-control populations 

Allelic frequencies of the biallelic markers in each of the populations can be determined using
one of the methods described above in Section X under the heading "Methods for
genotyping an individual for biallelic markers", or any genotyping procedure suitable for this intended
purpose. The frequency of a biallelic marker allele in a population can be determined by genotyping
pooled samples or individual samples. One way to reduce the number of genotypings required is to
use pooled samples. A major obstacle to using pooled samples is the accuracy and reproducibility
with which DNA concentrations can be determined when setting up the pools. Genotyping
individual samples provides higher sensitivity, reproducibility and accuracy and is the preferred
method used in the present invention. Preferably, each individual is genotyped separately and simple
gene counting is applied to determine the frequency of an allele of a biallelic marker or of a genotype
in a given population.
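Simple gene counting on individually genotyped samples can be sketched as below. The representation of a genotype as a pair of allele symbols, the function name and the example genotypes are assumptions made purely for illustration.

```python
from collections import Counter

def allele_frequencies(genotypes):
    """Gene counting: each diploid individual contributes two alleles,
    so the frequency of an allele is its count divided by 2N."""
    counts = Counter(allele for genotype in genotypes for allele in genotype)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no genotypes supplied")
    return {allele: n / total for allele, n in counts.items()}

# Hypothetical genotypes at one biallelic marker (alleles A and G).
sample = [("A", "A"), ("A", "G"), ("G", "G"), ("A", "G")]
frequencies = allele_frequencies(sample)
print(frequencies)
```

The same counting applied to whole genotypes rather than alleles gives genotype frequencies.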

3) Determining the frequency of a haplotype in case-control populations

The gametic phase of haplotypes is usually unknown when diploid individuals are
heterozygous at more than one locus. Different strategies for inferring haplotypes may be used to partially
overcome this difficulty (see Excoffier L. and Slatkin M., Mol. Biol. Evol., 12(5): 921-927, 1995).
One possibility is that the multiple-site heterozygous diploids can be eliminated from the analysis,
keeping only the homozygotes and the single-site heterozygote individuals, but this approach might
lead to a possible bias in the sample composition and the underestimation of low-frequency
haplotypes. Another possibility is that single chromosomes can be studied independently, for
example, by asymmetric PCR amplification (see Newton et al., Nucleic Acids Res., 17:2503-2516,
1989; Wu et al., Proc. Natl. Acad. Sci. USA, 86:2757, 1989) or by isolation of single chromosome by
limit dilution followed by PCR amplification (see Ruano et al., Proc. Natl. Acad. Sci. USA, 87:6296-
6300, 1990). Further, multiple haplotypes can sometimes be inferred using genealogical information
in families (Perlin et al., Am. J. Hum. Genet., 55:777-787, 1994). A sample may be haplotyped for
sufficiently close biallelic markers by double PCR amplification of specific alleles (Sarkar, G. and
Sommer S.S., Biotechniques, 1991). These approaches are not entirely satisfying either because of
their technical complexity, the additional cost they entail, their lack of generalization at a large scale,
or the possible biases they introduce. To overcome these difficulties, an algorithm based on
Hardy-Weinberg equilibrium (random mating) to infer the phase of PCR-amplified DNA genotypes
introduced by Clark A.G. (Mol. Biol. Evol., 7:111-122, 1990) may be used. Briefly, the principle is to start
filling a preliminary list of haplotypes present in the sample by examining unambiguous individuals,
that is, the complete homozygotes and the single-site heterozygotes. Then other individuals in the
same sample are screened for the possible occurrence of previously recognized haplotypes. For each
positive identification, the complementary haplotype is added to the list of recognized haplotypes,
until the phase information for all individuals is either resolved or identified as unresolved. This
method assigns a single haplotype to each multiheterozygous individual, whereas several haplotypes
are possible when there is more than one heterozygous site. Any other method known in the art to
determine the frequency of a haplotype in a population may be used. Preferably, an
expectation-maximization (EM) algorithm (Dempster et al., J. R. Stat. Soc., 39B:1-38, 1977) leading to
maximum-likelihood estimates of haplotype frequencies under the assumption of Hardy-Weinberg
proportions is used (see Excoffier L. and Slatkin M., Mol. Biol. Evol., 12(5): 921-927, 1995). The EM
algorithm is used to estimate haplotype frequencies in the case when only genotype data from unrelated
individuals are available. The EM algorithm is a generalized iterative maximum-likelihood approach
to estimation that is useful when data are ambiguous and/or incomplete. The EM algorithm is used to
resolve heterozygotes into haplotypes. Haplotype estimations are further described below under the
heading "Statistical methods".
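The principle of the Clark approach summarized above can be sketched in a few lines. This is a simplified illustration, not the published implementation: genotypes are assumed to be coded per site as 0, 1 or 2 copies of one allele, haplotypes as tuples of 0/1, and the first compatible known haplotype is accepted, so the result may depend on the order of individuals. The function name is hypothetical.

```python
def clark_phase(genotypes):
    """Simplified Clark-style phasing for biallelic sites.

    genotypes: list of tuples, each entry 0, 1 or 2 (copies of allele "1").
    Returns (resolved, unresolved): resolved maps an individual's index to
    its pair of haplotypes; unresolved lists indices left without phase."""
    known, resolved, pending = [], {}, []
    # Step 1: unambiguous individuals (homozygotes and single-site heterozygotes).
    for idx, g in enumerate(genotypes):
        if sum(1 for x in g if x == 1) <= 1:
            h1 = tuple(x // 2 for x in g)
            h2 = tuple(x - a for x, a in zip(g, h1))
            resolved[idx] = (h1, h2)
            for h in (h1, h2):
                if h not in known:
                    known.append(h)
        else:
            pending.append(idx)
    # Step 2: screen remaining individuals for already recognized haplotypes.
    progress = True
    while pending and progress:
        progress = False
        for idx in list(pending):
            g = genotypes[idx]
            for h in known:
                complement = tuple(x - a for x, a in zip(g, h))
                if all(c in (0, 1) for c in complement):
                    resolved[idx] = (h, complement)
                    if complement not in known:
                        known.append(complement)
                    pending.remove(idx)
                    progress = True
                    break
    return resolved, pending

# Hypothetical two-marker sample: one homozygote and one double heterozygote.
resolved, unresolved = clark_phase([(0, 0), (1, 1)])
print(resolved, unresolved)
```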

4) Genetic Analysis based on Linkage Disequilibrium 

Linkage disequilibrium is the non-random association of alleles at two or more loci and
represents a powerful tool for genetic mapping of complex traits (see Jorde L.B., Am. J. Hum. Genet.,
56:11-14, 1995). Biallelic markers, because they are densely spaced in the human genome and can be
genotyped in large numbers, are particularly useful in genetic analysis based on linkage
disequilibrium.

When a disease mutation is first introduced into a population (by a new mutation or the
immigration of a mutation carrier), it necessarily resides on a single chromosome and thus on a single
"background" or "ancestral" haplotype of linked markers. Consequently, there is complete
disequilibrium between these markers and the disease mutation: one finds the disease mutation only in
the presence of a specific set of marker alleles. Through subsequent generations recombinations
occur between the disease mutation and these marker polymorphisms, and the disequilibrium
gradually dissipates. The pace of this dissipation is a function of the recombination frequency, so the
markers closest to the disease gene will manifest higher levels of disequilibrium than those that are
further away. When not broken up by recombination, "ancestral" haplotypes and linkage
disequilibrium between marker alleles at different loci can be tracked not only through pedigrees but
also through populations.
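The dependence of this dissipation on recombination frequency can be illustrated with the standard population-genetics result that, under random mating, the disequilibrium coefficient D decays by a factor (1 - theta) per generation. The sketch below assumes this idealized model; the function name and the numerical values are illustrative only.

```python
def disequilibrium_after(d0: float, theta: float, generations: int) -> float:
    """Expected disequilibrium D after a number of generations of random
    mating, given initial value d0 and recombination fraction theta:
    D_t = D_0 * (1 - theta) ** t (idealized model, no drift or selection)."""
    if not 0.0 <= theta <= 0.5:
        raise ValueError("theta must lie in [0, 0.5]")
    return d0 * (1.0 - theta) ** generations

# Hypothetical comparison of a close marker and a more distant marker.
close_marker = disequilibrium_after(d0=0.25, theta=0.001, generations=100)
distant_marker = disequilibrium_after(d0=0.25, theta=0.01, generations=100)
print(f"close: {close_marker:.3f}, distant: {distant_marker:.3f}")
```

Under these assumptions the marker closer to the disease mutation retains more of the initial disequilibrium after the same number of generations.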

The pattern or curve of disequilibrium between disease and marker loci will exhibit a single
maximum that occurs at the disease locus. Consequently, the amount of linkage disequilibrium
between a disease allele and closely linked genetic markers may yield valuable information regarding
the location of the disease gene. For fine-scale mapping of a disease locus, it is useful to have some
knowledge of the patterns of linkage disequilibrium that exist between markers in the studied region.
As mentioned above, the mapping resolution achieved through the analysis of linkage disequilibrium
is much higher than that of linkage studies. The high density of biallelic markers combined with
linkage disequilibrium analysis provides powerful tools for fine-scale mapping. Different methods to
calculate linkage disequilibrium are described below under the heading "Statistical Methods".
Moreover, association studies as a method of mapping genetic traits rely on the phenomenon of
linkage disequilibrium.
5) Association studies

As mentioned above, the occurrence of pairs of specific alleles at different loci on the same
chromosome is not random, and the deviation from random is called linkage disequilibrium. If a
specific allele in a given gene is directly involved in causing a particular trait, its frequency will be
statistically increased in an affected (trait positive) population when compared to the frequency in a
trait negative population or in a random control population. As a consequence of the existence of
linkage disequilibrium, the frequency of all other alleles present in the haplotype carrying the
trait-causing allele will also be increased in trait positive individuals compared to trait negative individuals
or random controls. Therefore, association between the trait and any allele (specifically a biallelic
marker allele) in linkage disequilibrium with the trait-causing allele will suffice to suggest the
presence of a trait-related gene in that particular allele's region. Association studies focus on
population frequencies. Case-control populations can be genotyped for biallelic markers to identify
associations that narrowly locate a trait causing allele. Moreover, any marker in linkage
disequilibrium with one given marker associated with a trait will be associated with the trait. Linkage
disequilibrium allows the relative frequencies in case-control populations of a limited number of
genetic polymorphisms (specifically biallelic markers) to be analyzed as an alternative to screening all
possible functional polymorphisms in order to find trait-causing alleles. Association studies compare
the frequency of marker alleles in unrelated case-control populations, and represent powerful tools for
the dissection of complex traits.
Association analysis 

The general strategy to perform association studies using biallelic markers derived from a
candidate gene is to scan two groups of individuals (case-control populations) in order to measure and
statistically compare the allele frequencies of the biallelic markers of the present invention in both
groups.
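One common way to compare allele frequencies statistically between two groups is a Pearson chi-square test on the 2 x 2 table of allele counts. The sketch below is an illustration only, not the specific test prescribed here; the function name and counts are hypothetical, and the p value uses the exact relation between the 1-degree-of-freedom chi-square distribution and the complementary error function.

```python
import math

def allelic_chi_square(case_a, case_b, ctrl_a, ctrl_b):
    """Pearson chi-square (1 degree of freedom, no continuity correction)
    on allele counts: alleles a and b in cases and in controls."""
    n = case_a + case_b + ctrl_a + ctrl_b
    denominator = ((case_a + case_b) * (ctrl_a + ctrl_b)
                   * (case_a + ctrl_a) * (case_b + ctrl_b))
    if denominator == 0:
        raise ValueError("each row and column needs at least one observation")
    chi2 = n * (case_a * ctrl_b - case_b * ctrl_a) ** 2 / denominator
    p_value = math.erfc(math.sqrt(chi2 / 2.0))  # survival function for 1 d.f.
    return chi2, p_value

# Hypothetical counts: 100 cases and 100 controls (200 chromosomes each).
chi2, p_value = allelic_chi_square(case_a=60, case_b=140, ctrl_a=30, ctrl_b=170)
print(f"chi-square = {chi2:.2f}, p = {p_value:.5f}")
```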

If a statistically significant association with a trait is identified for at least one or more of the
analyzed biallelic markers, one can assume that: either the associated allele is directly responsible for
causing the trait (the associated allele is the trait causing allele), or more likely the associated allele is
in linkage disequilibrium with the trait causing allele. The specific characteristics of the associated
allele with respect to the candidate gene function usually give further insight into the relationship
between the associated allele and the trait (causal or in linkage disequilibrium). If the evidence
indicates that the associated allele within the candidate gene is most probably not the trait causing
allele but is in linkage disequilibrium with the real trait causing allele, then the trait causing allele can
be found by sequencing the vicinity of the associated marker.

Association studies are usually run in two successive steps. In a first phase, the frequencies of
a reduced number of biallelic markers from one or several candidate genes are determined in the trait
positive and trait negative populations. In a second phase of the analysis, the identity of the candidate
gene and the position of the genetic loci responsible for the given trait are further refined using a higher
density of markers from the relevant gene. However, if the candidate gene under study is relatively
small in length, as is the case for many of the candidate genes analyzed in the present
invention, a single phase is sufficient to establish significant associations.
Haplotype analysis 

As described above, when a chromosome carrying a disease allele first appears in a
population as a result of either mutation or migration, the mutant allele necessarily resides on a
chromosome having a unique set of linked markers: the ancestral haplotype. This haplotype can be
tracked through populations and its statistical association with a given trait can be analyzed. The
statistical power of association studies is increased by complementing single point (allelic)
association studies with multi-point association studies, also called haplotype studies. Thus, a
haplotype association study allows one to define the frequency and the type of the ancestral carrier
haplotype. A haplotype analysis is important in that it increases the statistical significance of an
analysis involving individual markers. Indeed, performing an association study with a set of
biallelic markers increases the value of the results obtained through the study, allowing false
positive and/or negative data that may result from the single marker studies to be eliminated.

In a first stage of a haplotype frequency analysis, the frequency of the possible haplotypes
based on various combinations of the identified biallelic markers of the invention is determined. The
haplotype frequency is then compared for distinct populations of trait positive and control individuals.
The number of trait positive individuals which should be subjected to this analysis to obtain
statistically significant results usually ranges between 30 and 300, with a preferred number of
individuals ranging between 50 and 150. The same considerations apply to the number of random
control or unaffected individuals used in the study. The results of this first analysis provide haplotype
frequencies in case-control populations, the relative risk for an individual carrying a given haplotype
of being affected with the given trait under study and the estimated p value for each evaluated
haplotype.
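In case-control designs, the relative risk attached to a haplotype is commonly approximated by an odds ratio. The following sketch assumes that (estimated) counts of chromosomes carrying and not carrying the haplotype are available in each group, and uses the usual log-based approximate 95% confidence interval. The function name and the counts are hypothetical and are not results of the present application.

```python
import math

def haplotype_odds_ratio(case_with, case_without, ctrl_with, ctrl_without):
    """Odds ratio for carrying a haplotype in cases versus controls,
    with an approximate 95% confidence interval on the log scale."""
    cells = (case_with, case_without, ctrl_with, ctrl_without)
    if min(cells) <= 0:
        raise ValueError("all four counts must be positive")
    odds_ratio = (case_with * ctrl_without) / (case_without * ctrl_with)
    standard_error = math.sqrt(sum(1.0 / c for c in cells))
    lower = math.exp(math.log(odds_ratio) - 1.96 * standard_error)
    upper = math.exp(math.log(odds_ratio) + 1.96 * standard_error)
    return odds_ratio, (lower, upper)

# Hypothetical chromosome counts in 100 cases and 100 controls.
odds_ratio, interval = haplotype_odds_ratio(40, 160, 20, 180)
print(f"odds ratio = {odds_ratio:.2f}, 95% CI = ({interval[0]:.2f}, {interval[1]:.2f})")
```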
Interaction Analysis

The biallelic markers of the present invention may also be used to identify patterns of biallelic
markers associated with detectable traits resulting from polygenic interactions. The analysis of
genetic interaction between alleles at unlinked loci requires individual genotyping using the
techniques described herein. The analysis of allelic interaction among a selected set of biallelic
markers with appropriate level of statistical significance can be considered as a haplotype analysis,
similar to those described in further detail within the present invention. Preferably, genotyping
is performed using the microsequencing technique.

Methods to test for association between a trait and a biallelic marker allele or a haplotype of 
biallelic marker alleles are described below. 
XI.D. Statistical methods
In general, any method known in the art to test whether a trait and a genotype show a
statistically significant correlation may be used.
Methods to estimate haplotype frequencies in a population 

As described above, when genotypes are scored, it is often not possible to distinguish
heterozygotes so that haplotype frequencies cannot be easily inferred. When the gametic phase is not
known, haplotype frequencies can be estimated from the multilocus genotypic data. Any method
known to a person skilled in the art can be used to estimate haplotype frequencies (see Lange K.,
Mathematical and Statistical Methods for Genetic Analysis, Springer, New York, 1997; Weir, B.S.,
Genetic Data Analysis II: Methods for Discrete Population Genetic Data, Sinauer Assoc., Inc.,
Sunderland, MA, USA, 1996). Preferably, maximum-likelihood haplotype frequencies are computed
using an Expectation-Maximization (EM) algorithm (see Dempster et al., J. R. Stat. Soc., 39B:1-38,
1977; Excoffier L. and Slatkin M., Mol. Biol. Evol., 12(5): 921-927, 1995). This procedure is an
iterative process aiming at obtaining maximum-likelihood estimates of haplotype frequencies from
multi-locus genotype data when the gametic phase is unknown. Haplotype estimations are usually
performed by applying the EM algorithm using for example the EM-HAPLO program (Hawley M.E.
et al., Am. J. Phys. Anthropol., 18:104, 1994) or the Arlequin program (Schneider et al., Arlequin: a
software for population genetics data analysis, University of Geneva, 1997). The EM algorithm is a
generalized iterative maximum likelihood approach to estimation and is briefly described below.

In the following part of this text, phenotypes will refer to multi-locus genotypes with
unknown phase. Genotypes will refer to known-phase multi-locus genotypes.

Suppose a sample of N unrelated individuals is typed for K markers. The data observed are the
unknown-phase K-locus phenotypes, which can be categorized into F different phenotypes. Suppose that we
have H underlying possible haplotypes (in the case of K biallelic markers, $H = 2^K$).
For phenotype j, suppose that $c_j$ genotypes are possible. We thus have the following equation:

$$P_j = \sum_{i=1}^{c_j} pr(genotype_i) = \sum_{i=1}^{c_j} pr(h_k, h_l) \qquad \text{(Equation 1)}$$

where $P_j$ is the probability of phenotype j, and $h_k$ and $h_l$ are the two haplotypes constituting
genotype i. Under Hardy-Weinberg equilibrium, $pr(h_k, h_l)$ becomes:

$$pr(h_k, h_l) = pr(h_k)^2 \ \text{if } h_k = h_l, \qquad pr(h_k, h_l) = 2\, pr(h_k)\, pr(h_l) \ \text{if } h_k \neq h_l. \qquad \text{(Equation 2)}$$
The successive steps of the E-M algorithm can be described as follows:

Starting with initial values of the haplotype frequencies, noted $p_1^{(0)}, p_2^{(0)}, \ldots, p_H^{(0)}$,
these initial values serve to estimate the genotype frequencies (Expectation step) and then estimate
another set of haplotype frequencies (Maximization step): $p_1^{(1)}, p_2^{(1)}, \ldots, p_H^{(1)}$;
these two steps are iterated until changes in the sets of haplotype frequencies are very small.
A stop criterion can be that the maximum difference between haplotype frequencies between
two iterations is less than 10'^. These values can be adjusted according to the desired precision of
estimations.

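In detail, at a given iteration s, the Expectation step consists in calculating the genotype
frequencies by the following equation:

$$\mathrm{pr}(\mathrm{genotype}_i)^{(s)} = \mathrm{pr}(\mathrm{phenotype}_j) \cdot \mathrm{pr}(\mathrm{genotype}_i \mid \mathrm{phenotype}_j)^{(s)} = \frac{n_j}{N} \cdot \frac{\mathrm{pr}(h_k, h_l)^{(s)}}{P_j^{(s)}} \qquad \text{Equation 3}$$

where genotype i occurs in phenotype j, where $h_k$ and $h_l$ constitute genotype i, and where $n_j$ is the
number of individuals in the sample showing phenotype j. Each probability is derived according to
Equations 1 and 2 above.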

Then the Maximization step simply estimates another set of haplotype frequencies given the
genotype frequencies. This approach is also known as the gene-counting method (Smith, Ann. Hum.
Genet., 21:254-276, 1957).

$$p_t^{(s+1)} = \frac{1}{2} \sum_{j=1}^{F} \sum_{i=1}^{c_j} \delta_{it}\, \mathrm{pr}(\mathrm{genotype}_i)^{(s)} \qquad \text{Equation 4}$$

where $\delta_{it}$ is an indicator variable which counts the number of times haplotype t occurs in
genotype i. It takes the values of 0, 1 or 2.

To ensure that the estimates finally obtained are the maximum-likelihood estimates, several
starting values are required. The estimates obtained are compared and, if they differ, the
estimates leading to the best likelihood are kept. The term "haplotype determination method" is
used to refer to all methods for determining haplotypes known in the art, including
expectation-maximization algorithms.
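The Expectation and Maximization steps of Equations 1 to 4 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the EM-HAPLO or Arlequin implementation: the function names, the 0/1/2 coding of each locus (homozygous for the first allele, heterozygous, homozygous for the second allele), the uniform starting values and the default tolerance `tol` are all choices made for this example.

```python
from collections import Counter
from itertools import product


def compatible_genotypes(phenotype):
    """List the haplotype pairs (genotypes) compatible with an unphased phenotype.

    Assumed coding per locus: 0 or 2 = homozygous (allele 0 or allele 1),
    1 = heterozygous.
    """
    het = [k for k, g in enumerate(phenotype) if g == 1]
    base = [g // 2 for g in phenotype]  # allele carried at homozygous loci
    pairs = set()
    for bits in product((0, 1), repeat=len(het)):
        h1, h2 = list(base), list(base)
        for k, b in zip(het, bits):
            h1[k], h2[k] = b, 1 - b
        pairs.add(tuple(sorted((tuple(h1), tuple(h2)))))
    return sorted(pairs)


def em_haplotype_frequencies(phenotypes, tol=1e-7, max_iter=1000, init=None):
    """Estimate haplotype frequencies by EM (gene counting).

    `tol` is an illustrative stop criterion; `init` optionally supplies
    starting frequencies so that several starting points can be compared.
    """
    counts = Counter(tuple(p) for p in phenotypes)
    n_total = sum(counts.values())
    genos = {ph: compatible_genotypes(ph) for ph in counts}
    haplos = sorted({h for gl in genos.values() for g in gl for h in g})
    if init:
        freq = {h: init[h] for h in haplos}
    else:
        freq = {h: 1.0 / len(haplos) for h in haplos}
    for _ in range(max_iter):
        new = dict.fromkeys(haplos, 0.0)
        for ph, n_j in counts.items():
            # Equation 2: genotype probabilities under Hardy-Weinberg
            probs = [freq[a] ** 2 if a == b else 2 * freq[a] * freq[b]
                     for a, b in genos[ph]]
            p_j = sum(probs)  # Equation 1
            if p_j == 0:
                continue
            for (a, b), pr in zip(genos[ph], probs):
                g = (n_j / n_total) * pr / p_j  # Equation 3 (Expectation)
                new[a] += 0.5 * g               # Equation 4 (Maximization)
                new[b] += 0.5 * g
        delta = max(abs(new[h] - freq[h]) for h in haplos)
        freq = new
        if delta < tol:
            break
    return freq
```

Calling `em_haplotype_frequencies` with different `init` dictionaries and keeping the run with the best likelihood mirrors the use of several starting values described above.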

Methods to calculate linkage disequilibrium between markers

A number of methods can be used to calculate linkage disequilibrium between any two
genetic positions. In practice, linkage disequilibrium is measured by applying a statistical association
test to haplotype data taken from a population.

Linkage disequilibrium between any pair of biallelic markers comprising at least one of the biallelic
markers of the present invention (Mi, Mj) can be calculated for every allele combination (Mi1,Mj1;
Mi1,Mj2; Mi2,Mj1 and Mi2,Mj2), according to the Piazza formula:

$$\Delta_{M_{ik},M_{jl}} = \sqrt{\theta_4} - \sqrt{(\theta_4 + \theta_3)(\theta_4 + \theta_2)}$$

where:

$\theta_4$ = - - = frequency of genotypes not having allele k at Mi and not having allele l at Mj
$\theta_3$ = - + = frequency of genotypes not having allele k at Mi and having allele l at Mj
$\theta_2$ = + - = frequency of genotypes having allele k at Mi and not having allele l at Mj
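As a hedged illustration, the Piazza formula can be evaluated directly from the three genotype-class frequencies. The function names and the input format (one pair of booleans per individual, stating whether allele k is carried at Mi and allele l at Mj) are assumptions made for this sketch.

```python
from math import sqrt


def genotype_class_frequencies(carriers):
    """Return (theta4, theta3, theta2) from per-individual carrier status.

    `carriers` is a list of (has_k_at_Mi, has_l_at_Mj) booleans, an input
    format assumed for this example.
    """
    n = len(carriers)
    theta4 = sum(1 for has_k, has_l in carriers if not has_k and not has_l) / n
    theta3 = sum(1 for has_k, has_l in carriers if not has_k and has_l) / n
    theta2 = sum(1 for has_k, has_l in carriers if has_k and not has_l) / n
    return theta4, theta3, theta2


def piazza_delta(theta4, theta3, theta2):
    """Piazza formula: sqrt(theta4) - sqrt((theta4 + theta3) * (theta4 + theta2))."""
    return sqrt(theta4) - sqrt((theta4 + theta3) * (theta4 + theta2))
```

When the two markers are independent, the two square roots coincide and the value is zero; a positive value indicates that allele k and allele l occur together more often than expected.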
Linkage disequilibrium (LD) between pairs of biallelic markers (Mi, Mj) can also be calculated for
every allele combination (Mi1,Mj1; Mi1,Mj2; Mi2,Mj1 and Mi2,Mj2), according to the maximum-likelihood
estimate (MLE) for delta (the composite linkage disequilibrium coefficient), as described by Weir
(B.S. Weir, Genetic Data Analysis, Sinauer Ass. Eds, 1996). This formula allows linkage
disequilibrium between alleles to be estimated when only genotype, and not haplotype, data are
available. This LD composite test makes no assumption for random mating in the sampled population,
and thus seems to be more appropriate than other LD tests for genotypic data.

Another means of calculating the linkage disequilibrium between markers is as follows. For a
couple of biallelic markers, Mi (ai/bi) and Mj (aj/bj), fitting the Hardy-Weinberg equilibrium, one can
estimate the four possible haplotype frequencies in a given population according to the approach
described above.

The estimation of gametic disequilibrium between ai and aj is simply:

$$D_{a_i a_j} = \mathrm{pr}(\mathrm{haplotype}(a_i, a_j)) - \mathrm{pr}(a_i) \cdot \mathrm{pr}(a_j)$$

where pr(ai) is the probability of allele ai and pr(aj) is the probability of allele aj, and where
pr(haplotype(ai, aj)) is estimated as in Equation 3 above.

For a couple of biallelic markers, only one measure of disequilibrium is necessary to describe the
association between Mi and Mj.

Then a normalized value of the above is calculated as follows:

$$D'_{a_i a_j} = D_{a_i a_j} / \max(\mathrm{pr}(a_i) \cdot \mathrm{pr}(a_j),\ \mathrm{pr}(b_i) \cdot \mathrm{pr}(b_j)) \qquad \text{with } D_{a_i a_j} < 0$$

$$D'_{a_i a_j} = D_{a_i a_j} / \max(\mathrm{pr}(b_i) \cdot \mathrm{pr}(a_j),\ \mathrm{pr}(a_i) \cdot \mathrm{pr}(b_j)) \qquad \text{with } D_{a_i a_j} > 0$$

The skilled person will readily appreciate that other LD calculation methods can be used
without undue experimentation.

Linkage disequilibrium among a set of biallelic markers having an adequate heterozygosity
rate can be determined by genotyping between 50 and 1000 unrelated individuals, preferably between
75 and 200, more preferably around 100.
Testing for association

The statistical significance of a correlation between a phenotype and a genotype, in this case
an allele at a biallelic marker or a haplotype made up of such alleles, is determined by any
statistical test known in the art and with any accepted threshold of statistical significance being
required. The application of particular methods and thresholds of significance is well within the
skill of the ordinary practitioner of the art.

Testing for association is performed by determining the frequency of a biallelic marker allele
in case and control populations and comparing these frequencies with a statistical test to determine if
there is a statistically significant difference in frequency which would indicate a correlation between
the trait and the biallelic marker allele under study. Similarly, a haplotype analysis is performed by
estimating the frequencies of all possible haplotypes for a given set of biallelic markers in case and
control populations, and comparing these frequencies with a statistical test to determine if there is a
statistically significant correlation between the haplotype and the phenotype (trait) under study. Any
statistical tool useful to test for a statistically significant association between a genotype and a
phenotype is used. Preferably the statistical test employed is a chi square test with one degree of
freedom. A P-value is calculated (the P-value is the probability that a statistic as large or larger than
the observed one would occur by chance).
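A chi square test with one degree of freedom on allele counts can be sketched as follows. This is an illustrative example only: the function name and the 2 x 2 layout (counts of the tested allele and of the other allele in cases and in controls) are assumptions, and the P-value uses the closed form for one degree of freedom, P = erfc(sqrt(chi2 / 2)).

```python
from math import erfc, sqrt


def allele_chi_square(case_a, case_b, control_a, control_b):
    """Chi-square statistic (1 d.f.) and P-value for a 2 x 2 allele-count table.

    case_a / case_b: counts of the tested allele and of the other allele in cases;
    control_a / control_b: the same counts in controls.
    """
    n = case_a + case_b + control_a + control_b
    denom = ((case_a + case_b) * (control_a + control_b)
             * (case_a + control_a) * (case_b + control_b))
    if denom == 0:
        return 0.0, 1.0  # degenerate table: no evidence of a difference
    chi2 = n * (case_a * control_b - case_b * control_a) ** 2 / denom
    p_value = erfc(sqrt(chi2 / 2))  # upper tail of chi-square with 1 d.f.
    return chi2, p_value
```

For example, 60 versus 40 copies of an allele among 100 case chromosomes and 100 control chromosomes gives a statistic of 8.0 and a P-value a little below 0.005.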
Statistical significance 

In preferred embodiments, for diagnosis purposes, either as a positive basis for
further diagnostic tests or as a preliminary starting point for early preventive therapy, the p-value
related to a biallelic marker association is preferably about 1 x 10^-2 or less, more preferably about
1 x 10^-4 or less, for a single biallelic marker analysis, and about 1 x 10^-3 or less, still more
preferably 1 x 10^-6 or less and most preferably about 1 x 10^-8 or less, for a haplotype analysis
involving several markers. These values are believed to be applicable to any association studies
involving single or multiple marker combinations.

The skilled person can use the range of values set forth above as a starting point in order to
carry out association studies with biallelic markers of the present invention. In doing so, significant
associations between the biallelic markers of the present invention and cancer and prostate cancer can
be revealed and used for diagnosis and drug screening purposes.

Using the method described above and evaluating the associations for single marker alleles or
for haplotypes permits an estimation of the risk a corresponding carrier has to develop a given trait,
and particularly in the context of the present invention, a disease, preferably cancer, more preferably
prostate cancer. Significance thresholds of relative risks are to be adapted to the reference sample
population used.

In this regard, among all the possible marker combinations or haplotypes which are evaluated
to determine the significance of their association with a given trait, for example a form of cancer or
prostate cancer, a response to treatment with anti-cancer or anti-prostate cancer agents or side effects
related to treatment with anti-cancer or anti-prostate cancer agents, it is believed that those displaying
a coefficient of relative risk above 1, preferably about 5 or more, more preferably about 7 or more, are
indicative of a "significant risk" for the individuals carrying the identified haplotype to develop the
given trait. It is difficult to evaluate accurately quantified boundaries for the so-called "significant
risk". Indeed, and as it has been demonstrated previously, several traits observed in a given
population are multifactorial in that they are not only the result of a single genetic predisposition but
also of other factors such as environmental factors or the presence of further, apparently unrelated,
haplotype associations. Thus, the evaluation of a significant risk must take these parameters into
consideration in order to, in a certain manner, weigh the potential importance of external parameters
in the development of a given trait. Without wishing to be bound to any invariable model or theory
based on the above statistical analyses, the inventors believe that a "significant risk" to develop a
given trait is evaluated differently depending on the trait under consideration.

It will of course be understood by practitioners skilled in the treatment or diagnosis of cancer
and prostate cancer that the present invention does not intend to provide an absolute identification of
individuals who could be at risk of developing a particular form of cancer or who will or will not
respond or exhibit side effects to treatment with anti-cancer or anti-prostate cancer agents but rather to
indicate a certain degree or likelihood of developing a disease or of observing in a given individual a
response or a side effect to treatment with a particular agent or set of agents.

However, this information is extremely valuable as it can, in certain circumstances, be used to
initiate preventive treatments or to allow an individual carrying a significant haplotype to foresee
warning signs such as minor symptoms. In the case of cancer, the knowledge of a potential
predisposition, even if this predisposition is not absolute, might contribute in a very significant
manner to treatment, or allow for suggestions in changes in diet or the reduction of risky behaviors,
e.g. smoking. Similarly, a diagnosed predisposition to a potential side effect could immediately direct
the physician toward a treatment for which such side effects have not been observed during clinical
trials.

Phenotypic randomization 

In order to confirm the statistical significance of the first stage haplotype analysis described
above, it might be suitable to perform further analyses in which genotyping data from case-control
individuals are pooled and randomized with respect to the trait phenotype. The genotyping data of
each individual are randomly allocated to two groups which contain the same number of individuals as
the case-control populations used to compile the data obtained in the first stage. A second stage
haplotype analysis is preferably run on these artificial groups, preferably for the markers included in
the haplotype of the first stage analysis showing the highest relative risk coefficient. This experiment
is reiterated between 50 and 200 times, preferably between 75 and 125 times. The repeated iterations
allow the determination of the percentage of obtained haplotypes with a significant p-value level
below about 1 x 10^-3.
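The randomization procedure can be sketched as below. This is a simplified illustration under loud assumptions: a single-marker allele-count chi square test stands in for the second stage haplotype analysis, genotypes are coded as the number of copies (0, 1 or 2) of the allele of interest, and all function and parameter names are hypothetical.

```python
import random
from math import erfc, sqrt


def allele_p_value(cases, controls):
    """Chi-square (1 d.f.) P-value comparing allele counts in two groups.

    Stand-in test for this sketch; a haplotype-based test could be passed instead.
    """
    a, b = sum(cases), 2 * len(cases) - sum(cases)
    c, d = sum(controls), 2 * len(controls) - sum(controls)
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        return 1.0
    chi2 = (a + b + c + d) * (a * d - b * c) ** 2 / denom
    return erfc(sqrt(chi2 / 2))


def randomization_fraction(cases, controls, test=allele_p_value,
                           n_iter=100, threshold=1e-3, seed=0):
    """Fraction of randomized case/control splits whose P-value is below `threshold`."""
    rng = random.Random(seed)
    pooled = list(cases) + list(controls)
    hits = 0
    for _ in range(n_iter):
        rng.shuffle(pooled)  # randomize genotypes with respect to phenotype
        fake_cases = pooled[:len(cases)]
        fake_controls = pooled[len(cases):]
        if test(fake_cases, fake_controls) < threshold:
            hits += 1
    return hits / n_iter
```

A small returned fraction indicates that P-values as significant as the threshold rarely arise when the phenotype labels carry no information, which supports the first stage result.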

Example 24 
Detailed Association Studies 
The initial association studies between the 8p23 locus and prostate cancer described in
Section I.D. were repeated at a higher level of sophistication.

Collection of DNA samples from affected and non-affected individuals 
Prostate cancer patients were recruited according to clinical inclusion criteria based on
pathological or radical prostatectomy records as described above in Section I. However, the pool of
individuals suffering from prostate cancer described in Section I was augmented from the original 185
individuals to a range of between 275 and 491 individuals depending on the marker tested. Similarly,
the control pool of non-diseased individuals described in Section I was augmented from the original
104 individuals to a range of between 130 and 313 individuals depending on the marker tested.

Genotyping Affected and Control Individuals
As for Section I.D., allelic frequencies of the biallelic markers in each population were
determined by performing microsequencing reactions on amplified fragments obtained by genomic
PCR performed on the DNA samples from each individual as described in Example 5.

Association Studies 

Association results were obtained using markers spanning a 650 kb region of the 8p23 locus
around PG1, both using single point analysis and haplotyping studies. See Figure 16. As compared
with the earlier representation of the initial association results for this region shown in Figure 2,
Figure 16 is to scale, since the entire region has now been sequenced. In addition, more markers were
generated around the association peak in the area of PG1, each of which has been tested in single
point analysis (hence the density of data within this subregion). The haplotyping curve in Figure 16
represents, for each marker considered, the maximum p-value for haplotypes obtained using this
marker and any number from all markers harbored by the same BAC and being in Hardy-Weinberg
disequilibrium with said marker.

The data presented in Figure 16 show a strong association between this specific region
within the 8p23 locus, especially in the area that has been identified as being the PG1 gene, and prostate
cancer. The maximum p-value in single point analysis, for the PG1 sub-region, is 3.10'^ while outside
of the PG1 subregion, most of the p-values obtained for single point associations are less significant
than 1.1 0'\. The maximum p-value obtained for haplotyping studies is the one obtained for a marker
inside PG1's BAC, and equals 3.10"^

Figure 17 is a graph showing an enlarged view of the single point association results within a
160 kb region comprising the PG1 gene. Markers involved in this enlargement were all located on
BAC B0463F01 (see Figure 16), except marker 4-14, which lies in very close proximity, on BAC
B0189E08. Figure 17 shows all of the markers which made up the maximum haplotype shown in
Figure 16. Some of these markers were later revealed to lie within the promoter, exonic or intronic
regions of the PG1 gene. The markers outside the gene were all informative biallelic markers with a
least frequent allele present at a frequency of more than 20%, while markers within the gene were a
mix of such informative markers and markers whose least frequent allele's frequency is less than
20%. These data confirm and narrow the previous peak of association values seen in Figure 16 to a
40 kb region harboring the PG1 gene. Significant associations are obtained for markers starting at the
promoter site with marker No. 99-1485, and ending at the 3' UTR site with marker No. 5-66.

Figure 18A is a graph showing an enlarged view of the single point association results of 40
kb within the PG1 gene. These data confirm that seven markers within the PG1 gene have one allele
associated with prostate cancer, with p-values all similar and more significant than 1.10'^, specifically
markers 99-622; 4-77; 4-71; 4-73; 99-598; 99-576; 4-66. Figure 18B is a table listing the location
of markers within the PG1 gene and the two possible alleles at each site. For each marker, the
disease-associated allele is indicated first; its frequencies in cases and controls as well as the difference
between both are shown; the odds ratio and the p-value of each individual marker association are also
shown.

The data in Figures 17, 18A, and 18B demonstrate that the markers in the PG1 gene have an
association with prostate cancer that is valid, and exhibits similar significance values, regardless of
whether the considered cases are sporadic or familial cases. Therefore, some PG1 alleles must be
general risk factors for any type of prostate cancer, whether familial or sporadic. The fact that several
p-values for associated alleles are around 1.10*^ suggests that all these markers are in linkage
disequilibrium with one another, and can all be used individually to assess PG1-associated prostate
cancer susceptibility risk. The prostate cancer associated alleles of the 7 markers discussed above all
exhibit an odds ratio of about 1.5, which means for each of them that the odds of prostate cancer for an
individual carrying such an allele are about 1.5 times the odds for an individual not carrying it.

In order to confirm the significance of the association results found for markers on the BAC
harboring PG1, a novel statistical method was performed as described in provisional patent
application serial no. 60/107,986, filed November 10, 1998.

Haplotype analysis

The results of a haplotype analysis study using 4 markers (marker Nos. 4-14, 99-217, 4-66 and
99-221) within the 160 kb region shown in Figure 17 are shown in Figure 19A. These 4 markers
have each been shown to be strongly associated with prostate cancer, i.e. with p-values more
significant than 1.10'^ on approximately 150 cases and 130 controls. All haplotypes using 2, 3, or 4
markers among the 4 above cited were analyzed using 491 case patients and 317 control individuals.
Figure 19A shows the most significant haplotypes obtained, as well as the individual odds ratios for
each. Haplotype 11 is the most significant (p-value of ca. 3.10"*), and is related to haplotype 5, shown
in Figure 4, in that three of the four marker alleles (4-14 C, 99-217 T and 99-221 A) are common to
both haplotypes, and both cover a similar region. Differences in p-values are explained both by the
addition of markers and of more case or control individuals. Haplotype 11 has a highly informative
odds ratio (of above 3); it is present in 3% of the controls and almost 10% of the cases.

Figure 19B is a table showing the segmented haplotyping results according to the age of the
subjects, and whether the prostate cancer cases were sporadic or familial, using the same 4
markers and the same individuals as were used to generate the results in Figure 19A. Figure 19B
shows equivalent results for all segments of the population analyzed, demonstrating that the PG1
associated alleles are general risk factors for prostate cancer, regardless of the age of onset of the
disease.

The haplotyping results and odds ratios for all of the combinations of the 7 markers (99-622;
4-77; 4-71; 4-73; 99-598; 99-576; and 4-66) within the PG1 gene that were shown in Figure 18 to have
p-values more significant than 1 x 10'^ were computed. A portion of these data are shown in Figure 20.
All of the 2-, 3-, 4-, 5-, 6- and 7-marker haplotypes were tested. Figure 20 identifies, for each x-marker
haplotype category, the most significant haplotype. Among all these, the most significant haplotype is
the two-marker haplotype 1, which shows a p-value of approximately 6.10*^ with an odds ratio of 2.
The frequency of haplotype 1 among the control individuals is 15%, while it is 26% among the case
patients. It is worth noting that these frequencies are very similar for all haplotypes presented in
Figure 20. It will thus be sufficient to test this two-marker haplotype for prognosis/diagnosis on risk
patients, as opposed to having a more complex test of a haplotype comprising 3 or more markers.
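As a quick check of how an odds ratio follows from case and control frequencies, the sketch below applies the usual definition (odds in cases divided by odds in controls) to the haplotype 1 frequencies quoted above. The function name is an assumption of this example, and the calculation is not claimed to be the exact procedure used to produce Figure 20.

```python
def odds_ratio(freq_cases, freq_controls):
    """Odds ratio from the frequency of an allele or haplotype in cases and controls."""
    odds_cases = freq_cases / (1 - freq_cases)
    odds_controls = freq_controls / (1 - freq_controls)
    return odds_cases / odds_controls


# Haplotype 1: 26% among case patients, 15% among control individuals.
haplotype_1_or = odds_ratio(0.26, 0.15)
```

With these inputs the result is 221/111, i.e. just under 2, in line with the odds ratio of 2 stated for haplotype 1.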

Finally, Figure 21 is a graph showing the distribution of statistical significance, as measured
by Chi-square values, for each series of possible x-marker haplotypes (x = 2, 3 or 4) using all of the
19 markers found in the PG1 gene. These data confirm that testing 2-marker haplotypes within PG1 is
sufficient because testing 3- or 4-marker haplotypes does not increase the statistical relevance of
the analysis.

Example 25 
Attributable Risk 

Attributable risk describes the proportion of individuals in a population exhibiting a
phenotype due to exposure to a particular factor. For further discussion of attributable risk values, see
Holland, Bart K., Probability without Equations: Concepts for Clinicians, The Johns Hopkins
University Press, pp. 88-90. In the present case the phenotype examined was prostate cancer, and the
exposure was either one single allele of an individual PG1-related marker, or a haplotype thereof in an
individual's genome.

The formula used for calculating attributable risk values in the present study was the following:

$$AR = \frac{P_e (RR - 1)}{P_e (RR - 1) + 1}$$

where:

AR was the attributable risk of allele or haplotype;

$P_e$ was the frequency of exposure to allele or haplotype within the population at large, in the
present study a random male Caucasian population; and

RR was the relative risk; in the present study relative risk is approximated with the odds ratio,
because of the relatively low incidence of prostate cancer in populations at large (values for the odds
ratios are found in Figures 18B and 20).

In this case, $P_e$ was estimated using a dominant transmission model for prostate cancer:

$$P_e = (N_{aa} + N_{ab}) / N$$

where:

$N_{aa}$ was the number of homozygous individuals harboring the disease associated allele or
haplotype within a given random population, and $N_{ab}$ was the number of heterozygous individuals in
said random population. $N_{aa}$ and $N_{ab}$ were calculated using the allele frequencies in the random
population as indicated in Figures 18B and 20, and N was the number of individuals in total random
population.
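The two formulas above can be combined in a short sketch. It assumes, as an illustration only, that the homozygous and heterozygous carrier fractions are derived from the allele (or haplotype) frequency under Hardy-Weinberg proportions; the function names and the input values in the usage line are hypothetical and are not taken from Figures 18B or 20.

```python
def exposure_frequency_dominant(allele_freq):
    """Exposure frequency Pe = (Naa + Nab) / N under a dominant model.

    Assumes Hardy-Weinberg proportions: homozygous carriers p^2,
    heterozygous carriers 2p(1 - p).
    """
    homozygous = allele_freq ** 2
    heterozygous = 2 * allele_freq * (1 - allele_freq)
    return homozygous + heterozygous


def attributable_risk(p_e, relative_risk):
    """AR = Pe (RR - 1) / [Pe (RR - 1) + 1]."""
    excess = p_e * (relative_risk - 1)
    return excess / (excess + 1)


# Illustrative inputs: allele frequency 0.15, relative risk approximated by an odds ratio of 2.
example_ar = attributable_risk(exposure_frequency_dominant(0.15), 2.0)
```

With these illustrative inputs the exposure frequency is 0.2775 and the attributable risk is about 0.22.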

We calculated the attributable risks of disease-associated alleles for markers within the PG1 gene
and presented these results in Figure 18B. In Figure 20, the attributable risk for the two-marker
haplotypes present in the figure is shown as well. These data demonstrate that disease-associated
alleles of PG1 are present in approximately 20% of prostate cancer patients in the Caucasian
population at large, and therefore represent prognostic tools of significant value.

SEQUENCE LISTING FREE TEXT
The following free text appears in the accompanying Sequence Listing:
identification method ProScan
potential start codon
exon1
Tyr phos
upstream amplification primer
polymorphic fragment
polymorphic base
downstream amplification primer
complement
upstream amplification primer 99-217-PU, extracted from SEQ ID1 34216 34234
Klein, Kanehisa and DeLisi identification method, potential helix
Eisenberg, Schwarz, Komaromy, Wall identification method, potential helix
Prosite match
potential Tyrosine kinase site, Prosite match
potential caseine kinase II site, Prosite match
potential Leucine zipper site, Prosite match
potential site, Prosite match
potential protein kinase C, Prosite match
potential cAMP and cGMP dependant protein kinase site, Prosite match
primer oligonucleotide
box2 from SEQID4, present in AF003136, P33333, P26647, U89336, U56417, AB005623
box2 from Z72511
box3 from SEQID4, present in AF003136
potential microsequencing oligo
complement potential microsequencing oligo
polymorphic fragment 4-77, extracted from SEQ ID1 12057 12103
polymorphic fragment 99-123, variant version of SEQ ID21
base A ; G in SEQ ID22
downstream amplification primer 99-217-RP, extracted from SEQ ID1 34625 34645
complement
polymorphic base C in PG1 (13680) SEQ ID1
stop codon
potential
amplification oligonucleotide
sequencing oligonucleotide
Box II
Box III
Box I
upstream amplification primer for SEQ 188, SEQ 265, SEQ 189, SEQ 266
downstream amplification primer for SEQ 185, SEQ 262, SEQ 186, SEQ 263, SEQ 187, SEQ 264
microsequencing oligo for 4-20-149.mis1

Although this invention has been described in terms of certain preferred embodiments, other
embodiments which will be apparent to those of ordinary skill in the art in view of the disclosure
herein are also within the scope of this invention. Accordingly, the scope of the invention is intended
to be defined only by reference to the appended claims.

marker    polymorphism   most frequent   less frequent   p*     q**    p*     q**
99-123    C/T            C               T               0.65   0.35   0.7    0.3
4-26      A/G            A               G               0.61   0.39   0.55   0.45
4-14      C/T            C               T               0.65   0.35   0.59   0.41
          C/G            C               G               0.67   0.33   0.76   0.24
99-217    C/T            C               T               0.69   0.31   0.77   0.23
4-67      C/T            C               T               0.74   0.26   0.84   0.16
99-213    A/G            A               G               0.55   0.45   0.62   0.38
99-221    C/A            C               A               0.43   0.57   0.43   0.57
99-135    A/G            A               G               0.75   0.25   0.7    0.3

*: frequency of most frequent base within each sub-population
**: frequency of least frequent base within each sub-population (p+q=1)
standard deviations 0.023 to 0.031 for controls
standard deviations 0.018 to 0.021 for cases

1. A recombinant, purified or isolated polynucleotide comprising a mammalian PG1
gene, cDNA, complement thereof, or fragment thereof having at least 10 nucleotides in length.

2. The polynucleotide according to claim 1, wherein said mammalian PG1 gene or
cDNA is human or mouse.

3. The polynucleotide according to claim 2, wherein the polynucleotide is selected from
SEQ ID NOs: 3, 69, 112-124, 179, and 182-184.

4. A polynucleotide selected from SEQ ID NOs: 185-578.

5. A purified or isolated polypeptide comprising a mammalian PG1 protein, or fragment
thereof having at least 8 amino acids in length.

6. The polypeptide according to claim 5, wherein said mammalian PG1 protein is human
or mouse.

7. The polypeptide according to claim 6, wherein said polypeptide is selected from SEQ
ID NOs: 4, 5, 70, 74, and 125-136.

8. The polypeptide according to claim 5, wherein said polypeptide consists of said
mammalian PG1 protein, or fragment thereof having at least 8 amino acids in length.

9. A polynucleotide comprising a nucleic acid sequence encoding a polypeptide
according to claim 8.

10. An antibody composition capable of selectively binding to an epitope-containing
fragment of a polypeptide according to claim 8, wherein said antibody is either polyclonal or
monoclonal.

11. A vector comprising a polynucleotide according to any one of claims 1, 4, and 9.

12. A host cell comprising a polynucleotide according to claim 11.

13. A nonhuman host animal or mammal comprising a vector according to claim 11.

14. A mammalian host cell comprising a PG1 gene disrupted by homologous
recombination with a knock out vector.

15. A nonhuman host mammal comprising a PG1 gene disrupted by homologous
recombination with a knock out vector.

16. A polynucleotide according to any one of claims 1, 4, and 9, further comprising a
label.

17. A polynucleotide according to any one of claims 1, 4, and 9, attached to a solid
support.

18. A random or addressable array of polynucleotides comprising at least one
polynucleotide according to any one of claims 1, 4, and 9.

19. A method of determining whether an individual is at risk of developing cancer or
prostate cancer, or whether said individual suffers from cancer or prostate cancer as a result of a
mutation in the PG1 gene comprising:

obtaining a nucleic acid sample from said individual; and

determining whether the nucleotides present at one or more PG1-related biallelic marker are
indicative of a risk of developing cancer or prostate cancer or indicative of cancer or prostate cancer
resulting from a mutation in the PG1 gene.

20. A method of determining whether an individual is at risk of developing cancer or
prostate cancer or whether said individual suffers from cancer or prostate cancer as a result of a
mutation in the PG1 gene comprising:

obtaining a nucleic acid sample from said individual; and

determining whether the nucleotides present at one or more PG1-related biallelic marker are
indicative of a risk of developing cancer or prostate cancer or indicative of cancer or prostate cancer
resulting from a mutation in the PG1 gene.

21. A method according to either one of claims 19 and 20, wherein said PG1-related
biallelic marker is a PG1-related biallelic marker positioned in SEQ ID NO: 179; a PG1-related biallelic
marker selected from the group consisting of 99-1485/251, 99-622/95, 99-619/141, 4-76/222,
4-77/151, 4-71/233, 4-72/127, 4-73/134, 99-610/250, 99-609/225, 4-90/283, 99-602/258, 99-600/492,
99-598/130, 99-217/277, 99-576/421, 4-61/269, 4-66/145, and 4-67/40; or a PG1-related biallelic
marker selected from the group consisting of 99-622, 4-77, 4-71, 4-73, 99-598, 99-576, and 4-66.

22. A method of obtaining an allele of the PGl gene which is associated with a detectable 
phenotype comprising: 

obtaining a nucleic acid sample from an individual expressing said detectable phenotype; 
contacting said nucleic acid sample with an agent capable of specifically detecting a nucleic 
acid encoding the PGl protein; and 

isolating said nucleic acid encoding the PGl protein. 

23. A method of obtaining an allele of the PGl gene which is associated with a detectable 
phenotype comprising: 

obtaining a nucleic acid sample from an individual expressing said detectable phenotype; 
contacting said nucleic acid sample with an agent capable of specifically detecting a sequence 
within the 8p23 region of the human genome; 

identifying a nucleic acid encoding the PGl protein in said nucleic acid sample; and 
isolating said nucleic acid encoding the PGl protein. 

24. A method of categorizing the risk of prostate cancer in an individual comprising the 
step of assaying a sample taken from the individual to determine whether the individual carries an 
allelic variant of PGl associated with an increased risk of prostate cancer. 

25. The method of Claim 24 wherein said sample is a nucleic acid sample. 

26. The method of Claim 24 wherein said sanq)le is a protein sample. 

27. The method of Claim 26. ftirther comprising determining whether the PGl protein in 
said sample binds an antibody that binds specifically to a PGl isoform associated with prostate 
cancer. 

28. A method of genotyping comprising determining the identity of a nucleotide at a 
PGl-related biallelic marker in a biological sample. 



# 
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29. A method of estimating the frequency of an allele in a population comprising 
determining the proportional representation of a nucleotide at a PGl-related biallelic marker in a 
pooled biological sample derived from said population. 

30. A method of detecting an association between a genotype and a phenotype,
comprising the steps of:

a) genotyping at least one PG1-related biallelic marker in a trait positive population;

b) genotyping said PG1-related biallelic marker in a control population; and

c) determining whether a statistically significant association exists between said genotype and
said phenotype.
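Step c) of the genotype association method does not name a particular statistical test. The following is a minimal sketch, assuming that allele counts from the trait positive and control populations are compared with a Pearson chi-square test on a 2x2 table (one common choice, not necessarily the test used in the application). The function name and the counts in the usage line are hypothetical.

```python
import math


def allelic_association(case_counts, control_counts):
    """Pearson chi-square test (1 degree of freedom) on a 2x2 allele-count table.

    case_counts and control_counts are (count of allele 1, count of allele 2)
    observed in the trait positive and control populations respectively.
    Returns (chi-square statistic, p-value).
    """
    a, b = case_counts
    c, d = control_counts
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if min(row1, row2, col1, col2) == 0:
        raise ValueError("each row and column needs at least one observation")
    chi2 = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # For 1 degree of freedom the chi-square survival function is erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Hypothetical counts: 60/40 alleles in cases versus 40/60 in controls.
chi2, p = allelic_association((60, 40), (40, 60))
print(f"chi2 = {chi2:.2f}, p = {p:.4f}")
```

The resulting p-value would then be compared with whatever significance threshold the study has chosen.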

31. A method of estimating the frequency of a haplotype for a set of biallelic markers in a
population, comprising:

a) genotyping at least one PG1-related biallelic marker;

b) genotyping a second biallelic marker by determining the identity of the nucleotides at said
second biallelic marker for both copies of said second biallelic marker present in the genome of each
individual in said population; and

c) applying a haplotype determination method to the identities of the nucleotides determined
in steps a) and b) to obtain an estimate of said frequency.
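Step c) refers to a haplotype determination method without fixing one. The sketch below assumes an expectation-maximization (EM) estimate for two biallelic markers from unphased genotypes, which is one widely used approach. The genotype coding (0, 1 or 2 copies of an arbitrarily chosen allele at each marker), the function name and the example data are assumptions made for illustration only.

```python
def em_haplotype_freqs(genotypes, n_iter=100, tol=1e-10):
    """Estimate two-marker haplotype frequencies by EM.

    genotypes: list of (g1, g2), where g is the number of copies (0, 1, 2)
    of the allele coded "1" at marker 1 and marker 2.
    Returns frequencies for haplotypes [00, 01, 10, 11].
    """
    if not genotypes:
        raise ValueError("at least one genotype is required")
    alleles = {0: (0, 0), 1: (0, 1), 2: (1, 1)}
    base = [0.0] * 4      # haplotype counts whose phase is unambiguous
    n_double_het = 0      # double heterozygotes: phase is 00/11 or 01/10
    for g1, g2 in genotypes:
        if g1 == 1 and g2 == 1:
            n_double_het += 1
            continue
        # With at most one heterozygous marker the two haplotypes are known.
        for x, y in zip(alleles[g1], alleles[g2]):
            base[2 * x + y] += 1
    total = 2 * len(genotypes)
    freqs = [0.25] * 4
    for _ in range(n_iter):
        # E-step: probability that a double heterozygote carries 00/11.
        cis = freqs[0] * freqs[3]
        trans = freqs[1] * freqs[2]
        w = cis / (cis + trans) if cis + trans > 0 else 0.5
        # M-step: add expected counts and renormalize.
        counts = base[:]
        counts[0] += n_double_het * w
        counts[3] += n_double_het * w
        counts[1] += n_double_het * (1 - w)
        counts[2] += n_double_het * (1 - w)
        new = [c / total for c in counts]
        converged = max(abs(x - y) for x, y in zip(new, freqs)) < tol
        freqs = new
        if converged:
            break
    return freqs


# Hypothetical genotypes for three individuals.
print(em_haplotype_freqs([(0, 0), (2, 2), (1, 1)]))
```

Only double heterozygotes are ambiguous with two markers, so the E-step reduces to a single weight; larger marker sets would need a more general enumeration of compatible haplotype pairs.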

32. A method of detecting an association between a haplotype and a phenotype,
comprising the steps of:

a) estimating the frequency of at least one haplotype in a trait positive population according to
the method of claim 31;

b) estimating the frequency of said haplotype in a control population according to the method
of claim 31; and

c) determining whether a statistically significant association exists between said haplotype
and said phenotype.

33. A method according to claim 31, wherein said PG1-related biallelic marker and said
second biallelic marker are 4-77/151 and 4-66/145.

34. A method according to claim 32, wherein said haplotype exhibits a p-value of < 1 x
10"' in an association with a trait positive population with cancer, or prostate cancer. 
35. A method according to any one of claims 29 to 31, wherein said PG1-related biallelic
marker is a PG1-related biallelic marker positioned in SEQ ID NO: 179; a PG1-related biallelic marker
selected from the group consisting of 99-1485/251, 99-622/95, 99-619/141, 4-76/222, 4-77/151,
4-71/233, 4-72/127, 4-73/134, 99-610/250, 99-609/225, 4-90/283, 99-602/258, 99-600/492, 99-598/130,
99-217/277, 99-576/421, 4-61/269, 4-66/145, and 4-67/40; or a PG1-related biallelic marker selected
from the group consisting of 99-622, 4-77, 4-71, 4-73, 99-598, 99-576, and 4-66.

36. A method according to either one of claims 30 and 32, wherein said control 
population is a trait negative population or a random population. 

37. A method according to any one of claims 22, 23, 30, and 32, wherein said phenotype
is a disease, cancer or prostate cancer; a response to an anti-cancer agent or an anti-prostate cancer
agent; or a side effect to an anti-cancer or anti-prostate cancer agent.

38. A polynucleotide for use in a hybridization assay for determining the identity of the
nucleotide at a PG1-related biallelic marker; for use in a sequencing assay for determining the
identity of the nucleotide at a PG1-related biallelic marker; for use in an allele-specific amplification
assay for determining the identity of the nucleotide at a PG1-related biallelic marker; or for use in
amplifying a segment of nucleotides comprising a PG1-related biallelic marker.



39. The polynucleotide according to claim 38, wherein said PG1-related biallelic marker is a
PG1-related biallelic marker positioned in SEQ ID NO: 179; a PG1-related biallelic marker selected
from the group consisting of 99-1485/251, 99-622/95, 99-619/141, 4-76/222, 4-77/151, 4-71/233,
4-72/127, 4-73/134, 99-610/250, 99-609/225, 4-90/283, 99-602/258, 99-600/492, 99-598/130,
99-217/277, 99-576/421, 4-61/269, 4-66/145, and 4-67/40; or a PG1-related biallelic marker selected
from the group consisting of 99-622, 4-77, 4-71, 4-73, 99-598, 99-576, and 4-66.
Figure 5

HAPLOTYPE SIMULATIONS (100 ITERATIONS)

Figure 7

PG1



AF003136 Ce 
(Genbank) 



Z72511 
(Genbank) 

P38226 
(Swissprot) 

P33333 
(Swissprot) 

Z49770 
(Genbank) 

P26647 
(Swissprot) 

Z49860 
(Genbank) 

U89336 
(Genbank) 

U56417 
(Genbank) 



boxl 

Hs NRQ 81-83 

NHQ 630 -632 



Ce 48 NHR 50 
AB005623 Mm
(Genbank) 



IIINHQIIJ^ 
81 NHQ 83 
116 NHQ 118 
72 NHQ 74 



Z29518 
(Genbank) 
95 NHQ 97

box2 box3

FPEGTR 760-7(55 LDAIYDVTV 277-279 

FPEGTR 772-777 LDAIYDVTV 762-770 

FPEGTD 729-754 VEYIYDITI 204-212 

FPEGTN 223-228 ESLYDITI 271-279 

FPEGTR 754-759 

FPEGTN 215-220 LDAIYDVTI 265-273 
FPEGTR 745-750 

FVEGTR 90-95 N?f<rnym\ 138-146 
FPEGTR 75«-77i 
FPEGTR 776-7S7 
FPEGTR 775-77S 

FVEGTR 770-775 VPAIYDTTV 275-225 



Hs = Homo sapiens; Ce = Caenorhabditis elegans; Ec = Escherichia coli; Sc = Saccharomyces
cerevisiae; Bn = Brassica napus; Zm = Zea mays; Mm = Mus musculus

- = pattern absent from protein sequence 



Note 



Functional acyl glycerol transferases all contain boxes 1 and 2 but not box 3.
Proteins most closely related to PG1 contain all three boxes with a high degree of conservation.
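The box comparison in the note can be illustrated with a small motif scan. This is a sketch only: the three regular expressions are assumptions generalized from the variants legible in the figure (NHQ, NRQ and NHR for box 1; FPEGTR, FPEGTD, FPEGTN and FVEGTR for box 2; the IYDVTV-like core for box 3), and the test peptide is synthetic, not a sequence from the application.

```python
import re

# Assumed patterns, generalized from the box variants listed in the figure.
BOX_PATTERNS = {
    "box1": re.compile(r"N(?:HQ|RQ|HR)"),
    "box2": re.compile(r"F[PV]EGT[RDN]"),
    "box3": re.compile(r"[IL]YD[VIT]T[VI]"),
}


def find_boxes(protein):
    """Return 1-based inclusive (start, end) positions of each box in a protein."""
    seq = protein.upper()
    return {
        name: [(m.start() + 1, m.end()) for m in pattern.finditer(seq)]
        for name, pattern in BOX_PATTERNS.items()
    }


# Synthetic peptide carrying one copy of each box.
print(find_boxes("MAAANHQAAAFPEGTRAAALDAIYDVTVAAA"))
```

A three-residue pattern such as box 1 will match by chance in many proteins, so hits are only meaningful when the boxes occur together and in the expected order.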
Alternative splicing

tggcaaacat tgcacaaaag tttacaactt cgtgactaac agtaatctgg ggtgattcac 180
aacaaattta cacataaaca catatttact gactttatac acagcaatcc taacgtgaac 240
acagaacctg ctttatcttt tcgcacactg ttctagtgta gagatgtctg gtctcagtta 300
aagaaagcat aaggagcatt agttgtgcac actgtccaca cccgtgactt ttttccacca 360
gtactaaacc tagtgcttct tacagtacag ggcaatgaca gccacagaaa gagagaagct 420
ccttttactg tgtaatgctt cctgctggcc ttcaaatact tgttacttga gagatctcca 480
ttcacctggc tttgtcccca aaggtcatca tctaccaatg atgttgttat ttgatgttaa 540
tcatgtataa agaaagtagc taccatcctg gccctgatta gaacttccca ctgaaatacc 600
gtcctgccta aaggtagcac aggtttccat tatggtggtg gtggggaggg ggcgggaata 660
tatatatata tatatatata tatatatatg gtaaagcatt cggcattctt ttaaagtaca 720
actatccttg aaaagggtta catattaaac catttttacc acagccaaag gggaggagaa 780 
agatccaaaa gtcctgtgga tctgctttaa catcaataaa acagttatcc acccttcgta 840 
gcttttagtg aaggctacaa aagtatgctt tttatggatt acacatgtgc acgcaactac 900 
tttaattact acagaaaaaa acgaggctcc ttattaaaaa aaaatcagaa acaagtccaa 960 
cagactctga ggaaatgaag caagagtgaa ttctgaaaag gtctaataaa cagtatggaa 1020 
atatccttgt gggattgttc ttcagctatg cataaacatg taattatcat cattactgtg 1080 
atggggaaaa acacggaccc taattctgaa acaccctggt agcgagagac gggcaggagg 1140 
ggctgctgcg cactcagagc ggaggctgag gaggcggcgt ccccttgcaa aggactggca 1200 
gtgagcagat ggggacactc gagctgcccc gcgacctggg ccgagctgcc tacaacctgg 1260

gcccaggtgc ctgcaagaat tagacctccg ataacgttaa cacccacttt ctcactgctc 1320

taattgtgtg catcccggcg cccaggggct tgtgagcagc aggtgcgcgt tccaggcagc 1380

tccagcgacc cttaaacctg accgcgcgca cgtccggccc gagggagcag aacaagaggc 1440 

acccggaccc tcctccggcc agcacccacc ttcacccagt tccgtcagtc gccaccacct 1500 

cccttcccgc gtccgcagcc ggcccagctg gggagcatgc gcagtggccg gagccgggtt 1560 

gcccgcgcca cagcaggtag ctgtactgca actgtcggcc caaaccaacc aatcaagaga 1620 

cgtgttattg ccgccgaggt ggaactatgg caacgggcga ccaatcagaa ggcgcgttgt 1680 

tgccgcggag ccccctgccc cggcaggggg atgtggcgat gggtgagggt catggggtgt 1740 

gagcatccct gagccatcga tccgggaggg ccgcgggttc ccttgctttg ccgccgggag 1800 

cggcgcacgc agccccgcac tcgcctaccc ggccccgggc ggcggcgcgg cccatgcggc 1860 

tgggggcgga ggctgggagc gggtggcggg cgcggcggcc cgggcccggg cggtgattgg 1920 

ccgcctgctg gccgcgactg aggcccggga ggcgggcggg gagcgcaggc ggagctcgct 1980 

gccgccgagc tgagaagatg ctgctgtccc tggtgctcca cacgtactcc atgcgctacc 2040 

tgctgcccag cgtcgtgctc ctgggcacgg cgcccaccta cgtgttggcc tggggggtct 2100 

ggcggctgct ctpcgccttc ctgcccgccc gcttctacca agcgctggac gaccggctgt 2160

actgcgtcta ccagagcatg gtgctcttct tcttcgagaa ttacaccggg gtccaggtga 2220 

gccgcctccc gctcccgggt ctcggcgtcc acccgagctc ccgggggcgc ggacctctcc 2280 

gctcccccac agctggcgag ggtcacccgg ccggcccggc ggacccagca cggagagcac 2340 

gtgccgcctc cccgccttcc tctccgcatg cttcctgccg ttctgccgag atcgctctct 2400 

aggaagctgt ggctgcgtcg tcctgaggct acgagtggga cccgccgccc ctttccccgc 2460 

ccctcgcctg ggtctgatgc tgcttagcaa agtgggtgca gatgcacgtt ttaaataata 2520 

gggcacgcgt ttagcagttt ctggcctttg gtccaaagag gtggtcatgt tggaacagat 2580 

cggagacgtc tacactccga agtgcgcttt tacagtgacc tcttgaaaca gaagtacaat 2640 

tcggtcttgt gttctttcec ctggacaagt gaaagctggg cgaagaaatg aatacatttg 2700 

ttaaccgtag aagcctaact agatacaatt cttgccaact ttaactgggc ttgaatgtgt 2760 

gggtgatctg ttgtctgatt actttctttc tgttactgtt tctctgtaga gattggattc 2820 

gtagattaaa cttgagaaac aaaccataaa agtggaaggc cctctttaac agtaggtatt 2880 

tgaagtgtta taaaaaaaaa aaaggtgaat ttttctttta tttctcagtt tgaaagaaca 2940 

gctttattct tggttattcc taatgtccac ctagtcctct tttacttttc ttggtagggt 3000 

tagggtggca tggggaaatg ggacggtatc attttgtctt tttaactttt tttttttcca 3060

cctacagcag ctgtttttac cctgtggtca gtcaggtact atatttagtt tgcagttgca 3120 

ctgctgatcg acccttgatg gccccagttg gaagttgttt ggggggaagg aactaggaga 3180 

ggccagggcc tccatttaaa ccagtgtctg taagtgtctc cttggaagga aaaaaagata 3240 

ctgttccagg tcatggtttc ctggtagttg acgtttaaaa tgggcctcat ttaaaaattt 3300 

caataattca ggctaatttt ttccctttat atggtaactc caccaagttt gtctaaatgt 3360

atgattttta tcatgattaa gtttttactt ccacatcatg tgacaactgg cctgggatgg 3420 

gatataagct cagaacacaa agtcattcac ctgttaaaaa aataattcta tctgtggcgg 3480 

gttatgttat ttttgttcaa agaggacaca atatgatgca gaatacacca ttgaaggatt 3540 

ttttggtttg gcaagttctt atttttttaa atggctgtaa aacctagcag tgtttctgaa 3600

attgcatacc ttacctgatg ttcagagatc cgatttactt cttgatttcc cagcaagtga 3660

ttttgaaaac atttaatcta atcattcccc ccaccgtctg ttcaaatcaa aggaagtggc 3720 

atccagcact aa^tttcatg catttatgaa aggatgcctg aggaccctta agtataattc 3780 

aaaattttgt ttaatgtgtg ttccttgatg aagttcttta ggagtcgtag aacgaactga 3840 

ttgcccactg atcatcaaat gcaagttatg aacatttaat aaaaatttaa aaccaagagt 3900 

ttcttgttcc tgcattttta tttttattgt atggagggga caaataatta ttttctgttt 3960 

agtaacagag cagggtattt tgaatttatt agggtctttt tctgcagtct gggtttcctg 4020 

tgtacacaaa gctacctttc aatatttttt attgtttctg ttaagattaa atcaatagag 4080 

gaataaatag ctatcttcaa acataagacc caaaggaaaa agatttatag tgatgttctg 4140 

tcaccttatt ttttacctgt gactttgtac cattaacttt gtcactgaga tgttttgatt 4200 

aaaattttta gcttgctttt cttgttttgt taggacactc tttttttctt gaattgtttt 4260 

tatcagcttt cgtttgcaag gctagtgatg attctcttgt tctgtataaa gtattgttga 4320 

ctcatttctg aagggagttt tagtaattta agaggttata agtttttaaa taaaaggttt 4380 

attaatttat atatattaaa gaggcatttt aaaataaaat tttttttaaa tgacattttt 4440 

acacctttca actctaggtt taaaaaataa gtggttcaca gtagttcttg cagaagaata 4500 

ttttctttta catagaattt ttaagctgaa gagaagtagt agtaggtcca tgagatttat 4560 

gatctgtgct tggcaggtaa acctgcttcc aacaaattta gttggatttt tcttggattc 4620 

tgggtaaata cctttttctt ccccagtttc acactttat tttcatatgt atctctgaga 4680 

tagagaaata tttcagtcag tgctgctaaa attgttcctt ataactcgtt tatcctttta 4740 

ggtccttcca gaatctctca ttggtactga aactcaaatg ggtactttct tcaccattta 4800 

tttctttaga ataagtaata agaattttat aagctttttt atatttcacg taatttgaga 4860 
ctattgaaaa tccagttaag tctctctact gtgttgagag gcattgattc aagtacctgt 4920
gttactttcc tgtgctgcca aaacagatca cctcaaacta agcggcttaa aataatagaa 4980
cttaagttct cgtgattctg gaggccagca ctttgaaatc aaggtgtagg ctcaatttta 5040
ctccctctgg aggccctagg gggaatctgt tcttgtgggt ttcaacttct ggtgactggt 5100
ggcattcctt ggcttggggc cccatcactt caacctctgc cttacagtcc ttgctgccac 5160
ctcttctgtc tcacatctca ctctcccttt ctcttagaag gatgcttgtc attgggttta 5220
gagcccacct ggatattccg ggatgatctc ttcatctcaa gatccttaat tataactgca 5280
aagagccttt ttccaaataa gaaaacattc acaggttcca gggcttagga tgtggacaca 5340
ttttttgagg ggctgccctt cattccccca caacaatgaa ctccatagtt ctgcctattc 5400
agtattttgt agttatttcg tagtttaact tgccttattt ctttaggtat ttacgtatta 5460
aagcattttg gtctctgctt tctttaacag agaacctggt tttctgtaat aagtttactt 5520
actttcccat aatcttttag tttcttattt acagatttac cttcacatat cccttaagta 5580
gaacatttga ttaactgttt tattttcgga acaaatctgc attctgtata ataaccaact 5640
tattcatatt tc^gtattct tttaattctt atctgattct gaaattacca tcttgtgatt 5700
atatatatat atatatggaa ataactgaaa tcttgataaa ttaaaggtga tataacttct 5760
aagacaatta attatgtatg atgtggtgaa tatactggtg tttggtttgt ttgccacfta 5820
aaagccctat ctataggata ggaagtaact tgaatgtgga atgcttagag actcagagta 5880
agaggccgta tatatatcct tgagctggag tttaaggaaa acttatggga aattaaaagg 5940
aaagttggag tactgacaga ggattgcgta ggactcatga aaaaggaatg aagttacctt 6000
aaattctatc atcgtgagtt aacgtgaaac tagatttatg ttagtttata gcctagaatt 6060
ctatcctagg aatctagata tatcctaaat gttgagatag ctgcataaac aataactgta 6120
atcgttatga taaataatga caaatctttt tagcatgttt tgtgaagctg ataaatgtta 6180
ataggatgtc ttcaaatgtc agaattcttt tttctttgct tcttttttaa aaaatttctt 6240
ttcccccatt cctatgcaat acactgaaaa ctgatcattg aaatttgtag gccaaaaaat 6300
taatcaacac gtaatagatt ggggtttggg tttttttgag tcagggtctt cttctgtcac 6360
ccaggctctg gtgcggtggc accatcatgg ctcattgcag ccttgaatgc ctgggttcaa 6420
gtgatcctcc ggagtagctg ccgtgccatt atttctagct aatttttaaa agtttttgta 6480
gaaatggggt ctttctgtgt tgcccaggct ggtcttgaat tcctggcctc aggtgatcct 6540
tctgccttgg cctcccaaag tgctgggatt acaggtgtga gccaccatgc ctagccccta 6600
ataaatattc taattaccga tttatcttgc ttaaatcagt tggtaacact tggaatttac 6660
ttcagaatat attttacatt agtggctctg actgctaatt cccccttctc caaatgctaa 6720
tgtaatataa caataaaatg cacagttctt aagtttatat aaaataaaca ggttttcagt 6780
tgacctgctt taagtgtaaa atagtgtgaa aaacacaaga aagaagataa agaatttaag 6840
attttgacat ttctctaata tgcccttaac ttctccaagg attcatactt ttttttgtaa 6900
gacagaatct cacactgttg cccaaaccag aggtgcagtg gtgcagtctc cactcactgc 6960
aacctctgcc ccpgggctca agcggtcctc ccacctcagc ctcctgagta gctgggacta 7020
caggtacaca gcaccatgcc cagctaattt ttttttttgg tattttttag tgggggtaga 7080
gacgagattt tgccatattg cccagtctgg ttttgagctc ctgggctcaa gtgatccgtc 7140
cttgatccac catgcttagc tgattcatac tcttaactga aacattgttc caagtttctc 7200
agaaacagtc aaggcttttt atctagagaa catttataac tggatctttc tttgtgtagc 7260
actgattcat caaactaatc ctaaactcct aatgagttaa atttatattc tgaatcttgc 7320
tgtaaaagca gccattcatt agaatgaaac atgtttactt agaattggag aagggagctt 7380
ataagtcatc tagtctactc ccttttatga cacttctaca ttctttctgc acttctgcca 7440
aaatgttgcc cagcgtcgtc tctgatacct atagtcctaa caagaatatg aatcatacct 7500
tgtatcctta attttactct tctctgctta tttgccattc atgtgaagac cttaaataga 7560
tcttaaattg cttccttcac tttagctgag agtgacagga ctgtgtaggt gtgggtgtgt 7620
ttctgcattt gcttatttaa gcaggataat aaaaactttt actataggaa attaaacatt 7680
tcccaatcaa atacaattcc agtctaacac aattaaattc tggttaggga actgcttaac 7740
ttactagact tataggaaaa tactaaaaaa atgtaactag aactctattt ttacacttta 7800
taaatataaa cctctgtgaa caaaccagtt atttcaggtt gcatttgtgt atagtttttt 7860
aatgcctgat ttttctattt taaaatcaca gatgcaatta tacattcaaa cactgccaca 7920
atactttgag aaagttaaag tttcccctac tcctacactg cgtacacctt tcctaggtac 7980
atcccagttt ggtgtgtaac tttagatttc ttccaagagc ttttgagtaa gtgtttgaat 8040
tgtgggaagg ttctttagtt aaatgaactt cttacagatc agttttttag tacagtagca 8100
cgaaatatac ctgcatacct atggggatac ctctgtgcca ttacgatgga aggcacggga 8160
aaacagcact ccgtatatac ctagtttact ttccctcttt tgtatatttg tctgattttg 8220
tggagctgat gcttctcaag tggaatcaga agttaacttt tcctttacta ttttctcatt 8280
ttattatggt ttcttaacta gaggttgatg ttagtggttg gaccattcaa tagtaagtaa 8340
tgacttttca gtaagggatc tctagaaccc agatccctta attcctgcaa tattcccgtg 8400
tgtacattgt tccaggtgct gtcctgggta ccaagggata caatgtttga tagacaatgt 8460
acctgccatt atggaggtca cattctagtg tgggaagaca aacaataaca agaaaatgaa 8520
aatttactgt gccatgccag gttgtttagc ctggtgggtg agaggtaggg gtttggaaaa 8580
tcttactgag caagtgacat ttgtgtggag ctctgtaaaa gggccagctt ggaaggtaat 8640
gtagtcatcc aggtgagaaa tgatggttag gggagtggaa agagtggatg ttaagattga 8700
aaagaattcc aaatctattt tagtggtagc tgatagggct ttgtgattga atgtggagga 8760
aaaagaagag ggtgggttag taacacactc agtcgcagtt agtgagtgct gctgtgtgca 8820
agtattgttc tattatgtaa ataattccat ctttacaaag taggcaccat tcttcctctt 8880
ttacagacaa ggaaaaggga acacccatgg ttcacatctg tagtagccta gccaggagtt 8940
tcaggcactt attttctgaa gatgctctgc ctggcaatgt ggttatattg gttgaaatga 9000
gaccccctac tttcaaggta ttcatctagg aaagacatga actgccaatt acaatatagg 9060
ataacactga aattagagac gtgtttatta actttgccat acagaggtaa agtaactctt 9120
taaagtaact ctttgcttgg gttagtggag aaggctataa aaattacttg gagtttttac 9180
tttgaacatg cgtaattaac atggaatgtt tagggaaaag aggttttcaa ttgataacat 9240
aataaacatg aggagtttga agcatggcat tcaaggtttt ctaaattctg ccccggttaa 9300
cttttccatt cgttggtttc attctagtct agcttttcct tctgggccgc ccctccccac 9360
attagaccgc tcctctctgg aattccaact caagcccttg cttttctcca tctgtcatga 9420
tgttacccca tctcattgtc agggtaactt ttatgtaata ttaacatata taatactgat 9480
ataacattag catattttaa tgtatggatc atctcctctg caacattgta acctcttgga 9540
gatggcaata atgggaagaa tgacttgatt ttactttttc ttttaacaaa aatggtggag 9600
tagtctgggc acggtgtggc tcatgcctgt aatcccagca ttttgggagg ccaaggaggg 9660
tggatcactt gaggtcaggc attcgagacc agtctggcca acattgtgaa accccatctc 9720
taccaaaaaa atacaaacac ttactgggca tggtggtgtg tgcctgtagt cctagctact 9780
caggaggctg aggtgggaga atcacttgaa catgggaggt agaggctcca gcttgggcga 9840
cagagtgaga ccctgtctca aaagaaaaaa aaggtaaaag ggccaggtgc ggaggctcac 9900
gctggtaatc caagcacttt gggaggctga ggcaatggat cacctgaggt cgggagttcg 9960
agatcagcct gaccaacatg gagaaacccc ttctctacta aaaatacaaa attagccggg 10020
cgtggtggtg cctgcctgta atctaagcta catgggaggc tgaggcagga gaatcacttg 10080
aacccaggag acagaggttg tggtgagcca agatggcacc attgcactcc cgactgggca 10140
acaagagcga aattccgtct caaaacaaac aaacaaacaa aacaaaacag agagaaaagg 10200
cagagtactc tagggaattc tagtctgtgt ttctgtggaa atgtatatga atctcacttt 10260
taagggatgg agatttttga atggcataac tagttgataa gttttgctct aacagggtac 10320
ccaagtctag tgagtccgat tcattctttc cttaaataga tgaaggagga agaaacatga 10380
ctccaccctc aagagtaagg cagaatgagc aaagtcagag aagttaaaaa agaattctca 10440
cgcagccagc agtgcagaga aaccttggtt tagttgtgaa tcaaaaccag tactttttgt 10500
aatttttgag cctatgcaat tctccaaggt tttatgttgt ttcttctgtt tctctgtagg 10560
caccagaaat caaaacccca aataagaaag tgttacttga agattttaga gtacttattt 10620
gtgtataagt gtaagtgata tttggaagac gactttactg cgctcctcca gcttggcatg 10680
agaattccag ggbcggaaag aaaggagggt gatggtacct ggaaaggaga gtcatgttaa 10740
gtcccagcca catattaagt gctaaccacc tactgttaaa aggtgtaatg ttctagactg 10800
acaaaataca tagtctctac cgtaaagtaa cacataattt agcagtgcag aaagatgtca 10860
cttaaaagaa aacttgaata tatgctgaga tagttcacaa attaaagaaa tgaacaaaga 10920
actgaggaaa taaaggagga atacaactgt gtccaaatga atacttaact gggtgggagc 10980
tgttgcatat gtaagcaggt ggttcaccta aaagttggat gtaacgtagt taacgccagc 11040
tcttggtgca cttacatatt gcattgcttc cgggcttaat ttgtgttcat ataggaataa 11100
attttttgtt ggtttttaat tttactcctt gtaattccgt ggttgatatt caaagtgaaa 11160
aaaattacat aagcttctaa tatatgagaa gtcttctcac ttgacatttt ttatttggaa 11220
tttttgcaga gagtagtttt gtcacagtca aaagattttg ggatcttgca gtgagaaacc 11280
taggtgtaat tcctatttct ctgccattcc gtatgtcatc tggattaagt gtcaacttct 11340
cagtctcaag attctcgtcc ttaaatggaa tactttttgt catgctattt tgaagacaaa 11400
atgagataat acgtgaaact gcctagctca gtgaatggta catcatagat actcagaaaa 11460
aacacaccct ctaaaataag aacagtacca aaagacagga tgtaaaataa gggcagtacc 11520
aaaagacaca tgcatgctga gtgtatgaga aagaactttg tggccttctt gggtggcaca 11580
ggccatggca gttccacagc atgacgtggt tgctgtgggt ggtagagcag acatgccgct 11640
ccccgtcact gcctggcttt gatgcttgct ttcttcagct gagaggacgc agctgtgata 11700
tgaaggtctt gtgtgtacag tcgtgacctc acatttccaa tttcctgctg gcagaaccca 11760
cagtctacaa cgtacgagca ccagagttga cgtgagacag acagcataca gaggcttgta 11820
acatccttct ggaaaacact gtgtaagctt tcagtgcgaa taaacatgat cagtggcaag 11880
ttctgttaga tgtagtctgc aagcatcctg attttactgg gcaagactat gttgatttac 11940
aggcggctga tgbttccatg gatagcccac tactagtatt ttcacaaatt tcacaagaca 12000
ttcttactgg aa^attgccc tgttcttatg atactgctgc ccttttagct tcatttgctg 12060
ttcagactaa acttggagac tacagtcagt cagagaactt gctaggccac ctctcaggtt 12120
attctttcat tcctgatcat cctcaaaatt ttgaaaaaga aattgtaaaa attacatcag 12180
caacatatag gcctatgtcc ttgagaagca gcagttaatt acctaaacac agcaagtacc 12240
ttagaactct gtggagttga attgcactat gcaagggatc aagtaacaat aaaattatga 12300
ttggaatgat gtcaagagga attctgattt ataacaggct atgaatgagt acctttccat 12360
ggtcgaagat tgtaaaaatt tgttttaagt gcaaacagtt ttttattcag ctttgaaaat 12420
gacttgcata aatctggaga aagattatca ggatttaata tggtgaatta tatggcatgt 12480
aaacatttgt ggaaagcaag tttagaacat cacatattct tctgtttgga cagaccactt 12540
ccaactagaa agaatttttt tgcacattat tttacattag gttcaaaatt cctaatgcat 12600
ggtgggagaa ctgaagttca gttagttcag tatggcaaag aaaaggcaaa taaagacaga 12660
ctacttgcag gatcctcaag taagccattg acgtggaaat taatagtttg ggaagtagta 12720
ggcaggaatt caatatctga tgaaaagatt agaaacataa agccttccat cacaattccc 12780
acccggaaca ggaattccta ctcatcaaaa ttctgcattc atacaagagg gaacctgatt 12840
atgaccatct tctgttggtc atttggtaga ttatgtggtt cacacttctt ccaaatattt 12900
gcaaatcaga catcaccatt atcagcacaa gctaatagca tcattctgga atcatcacta 12960
ttacaggaca cccctggaga tgggtagcct ccagctttac cacccaaaca agctaagaaa 13020
aactgttgga accaaattca ttatttacat tttcaacaag atctggaaga tcatattaat 13080
gaaacgttga tgttctatct tctcttaaaa aatctgctcc taatggtggt attctacatg 13140
ataatcgtgt tctaatccga gtgaacctga cgaaaatgga aggtttggag tcaatgcaaa 13200
gggggatatg atcagaagat gtctgtgatc gtgtcctgag aagcaccagg aacacctttg 13260
acctcagtga ctctcgattg aagagaagac caagttgtat tgatcagtgg ttgggacttt 13320
acagaacaca cccatgattg ggttgtcctg ctttttaaag ccaactgtga gagacattct 13380
ggggaactca tgcttctagt tctacctatg ctgcatatga tgtagtggaa gaagtgctag 13440
aaaatgagac agacttccag tacattctgg agaaagcccc actagatagt gtccaccagg 13500
atgaccatgt gctgtgggag tcagtgatcc agctaaccga gggcttatcg ctggaacatt 13560
ctggacacaa ttjbgatcaac ttatcaaaaa aaaacttgga atgacaattt ctggtgccag 13620
attaccttag aaf:ctttgca aaaatagata gagatagttt tccttatgat gttacatggc 13680
ttatttttaa ag^taatgaa aactacatca gtgtaattcc agcatcataa gtcagaacag 13740
tgcttgtcaa gg^gcgttac cacacacttg aacagatttt tggcagatga cttgggaaca 13800
aggctcctcc at^tttgtaa tgttgaccac acaagttgaa tgtggcagag ttaaatgacc 13860
ccaatattgg ccpgaaccca caggaagttc atcctatgga tgctaccaag ccttctgcca 13920
ctgagaagaa ggpagcactg tctttatctt caggaagatc acactgctgt ttaaccaaga 13980
gaaaaattag agpgtcatca atcacgcaga tccagtacag agggtggcct gaccatggag 14040
accctgatga ttcagtgact ttctggattt tgtttttcat atgcaaaata agagggctag 14100
caaggaaaaa ccbcttgttg tttcttgcag tgctggagtt ggaagaacca gcgttcttaa 14160
tactatggaa acagccatgt gtctcattga tctcattgaa tgcagtcagc cagtttattc 14220
actagacatg gtaagaacaa tgagagagca gtgagccgtg atggtccaaa cacctagtca 14280
ttacagtttt gcgtgtgaag tactattttg aaagcttatg aagaaggctt tgctgaagaa 14340
agcaaaagga aaaaaagaac tttgtcatct gttaggttcc atttattgca tgataattgt 14400
gtttgtattg attattgggc aagtagctgt ttgctatttt gatcttattt cagaagggca 14460
taataatttt actattcaat gaaacgtttt aaacggggta gaaaaagact agtttttgta 14520
tgctttacag cagaaatctt ataatgatta actggtaata tatttcgttg gcataaaaat 14580
acatttaaaa gttcaagtaa ttataaacat tgtaaattgt atatgtaatc atattgaaat 14640
tgaaattctt tatagctgta cttctgtgta atcaaagact ggggagagat agactagcta 14700
gctctttctc ttatccatta atcacttaac agagttttga ataaaaagtt ccatttcatg 14760
ggataagaat aatgacaggt taacctattt tagttggtta ctatgttcta ggtgttgtat 14820
gaagtagttt acatagtttc actgatttca ctacaatccc aggaggagta gttactatta 14880
ttacactcat tttacaggca aagaaatagg tttggagggg ttgggtgttt tgcccaagtt 14940
ctcatcgtaa aatgacagat gaggattcaa attcaagtct taattgaagt ccattacttt 15000
agaacctacc tcttagtggc tcttatgtta cagtataagg gagagcagac tgttccttta 15060
cccttgtagg gtpgctaggg cttgtgaatt aagagactga ttaacaggag aagaggcata 15120
cacattttat tg^cgttagt atttttacat gcacagggaa ggagggtttt atttttattt 15180
ttatttttat ctjbtatttta aagagacagg ggtcttgctg tgttgccagg gctggactca 15240
aactcctgaa gcpaagcgat tcttctgctt gagattcctg agtagcaggg actataggtg 15300
tgctcctctg tgbttggcta aagaaggggt ttgtatgtga tttttaacaa aggctgataa 15360
attgtgaaga agtgactagt caaaggagaa gaggatttca gctcccaggg gtggtaaatt 15420
gtgggaagat gabtaggaaa tgtatagtaa taaggtttgc tatgcaggtt tattttgcca 15480
gtttctggtc tcptaataag ggacagggaa acacctttac agatggaaat tcatatcacc 15540
tttccacagg gakatttatg tcctgcctta ggcagttagg ggaagggcag agaattcttc 15600
ctgtatctgc tgtgtctcag gtgccttcag ctcaaaataa tccttatgcc aaagtagcat 15660
atttgggtgt ggcatattct ctgatctctt tcaacagcat catctatact taacaacagc 15720
aaaagttttt tttaaaaaat catgtttcaa gatttgcatg tggaagacaa atggacatga 15780
ttgagataaa tgaagaatat atatttttta acaaagaatg ctgtatattt atgtctctgt 15840






gacattgtgt tatggaggct aaggtgttaa gcatgtgatt actttagatg ccgtatgact 15900
acctgttttt aagattaaaa aagaatcaat aggcagttta tatgcatggg agcaagttaa 15960
aaacaacaca gatgtgatga aggcgaggtg aaactggtcc gcatctaatt caggccttct 16020
cctgaaagcc agtgtgtgca agataaataa gtttgtttga cgaaagcaga ataactagtt 16080
tgtcctttgt gatgaagata gttattcaga aatcattttt attggctacc tctgaattaa 16140
taaatgaaaa gagaaatttt tttttctgta ggggatgtct gatgagttct taaaaagtgg 16200
atgaacctga aattatcatg aacaagcaat tataatgaac ttaaaattac ttaaagagtt 16260
atgaaaaaca aaaagaaaag ccgtatgttt tcttgtgcct tattttgaag tgacaaatta 16320
tttgcagggt acatttgtag acggaactaa tgtgatttaa aaaatgagta ctagatttac 16380
agaatgatgc ctttaaaaag tcactggtgc actttaatta ttttatttat gtttattctg 16440
aaactacctt tattttgaaa atgaggtata gctttgccta ctggtgacaa aagtgtaaat 16500
aattcagtaa acatctgtta aaaaccagct tggtgctagg ctcttggggt agaaaactga 16560
tcaggccatt gaggagctca tagtccctaa ggggctgggg acttgtcatt aggtgtgcag 16620
tgtgttctgg atgctcctga aggagtgtgg gcaggtgcgc accaccatgc ctggctaatc 16680
tttttataat tatgtagaga cagggtctgg ctgtgctgcc catgctgggt ttgaacttct 16740
gggcttaaga gatcttccct ccctgcccct accgaccccg cccgaccact ccacctcagc 16800
ctccccaaag caqtgggatt gcaggcatgg gccactatgc ctgggctgtg caaaactttt 16860
aaatcagtgc atkctcaatg gtcttgatgc aattctggct tgttggtaag agaatgggga 16920
tttactcaca agccacgatg tcacttttaa ctctgaacag atcaagctat tggtattact 16980
catttatgtc atpgataaac tttatgaata aaaactcatt gtgcaaatat ttaaacatac 17040
tacatacata gcpctgtgca gtttctaagg aaagtaatgg aaacctttgt cacatccctg 17100
gcttccagaa ctttatgtta tctaagtgca tttgtctgca aagttgttgg gttaattgcc 17160
cctttctttc ttptcttttt aagatattaa taaatagtgt catgaccaaa agataatcct 17220
tatggacaag atagatctaa aaagccttag ctaatttata atcttgcata atccatgatg 17280
acaagatgca gapacaaaaa tgcccagaat aaaaacttag caccattagc agccatttcc 17340
ttttaagtct ttacaagtat actcccagtt tcttgaaaaa tttattctaa aatatgtaag 17400
acacacaaaa cagcagaagg actaatacag gtacatcgaa cacctgtgtg cctaccgccc 17460
agtttaaaaa taaactggaa tgatgtttct ctcatactta cagaataaag ttttaatctt 17520
tagcatggaa ttcaaaagac ttctgccatt ccagttcaga gccacccttc tggtctcctt 17580
gctcctcagc cgcgacactg cccatgtacc caacaggcct ccagggttac tgcttccatt 17640
cgttcttatt ctcatgaaca ttttccttca tctcatctgc cagaatccta cctaataata 17700
ctcctgctct gcagtttaca gttctttaaa attaaaaaag gttgtgtacc ctttagtgtc 17760
ctgaaaaaag aaaaaacaaa tttaaaacct taaaaaggta ccatattttc atagtatttg 17820
cgttatgtct cattacagtt cctgtggaca tgtctgtctc ttttactaga ttgattgtgg 17880
gctctttgaa ggaagatata tcttatgaac agtgttttat atattgttag caatcaatga 17940
atgcttgcta tatttttctc atgaggatat tgattattct attttaattt attaccnnnn 18000
nnntgtacta tacataactg ctttctgtac ctgagctatt tatgatctct gaggctcctg 18060
tgagaaatct aatttttgtt aatcatggat ggaaatattc acaacatcat tcgtcagttt 18120
cttcacattg tcttcctttg tatattacag atgttttaaa atatcaaagt aatgtttttt 18180
tgttttatct tttagatatt gctatatgga gatttgccaa aaaataaaga aaatataata 18240
tatttagcaa atcatcaaag cacaggtttg tatttcattt gcatgaaacc taggtttttc 18300
tacagatggc acatgggcat tcaaaatacc gttcttatat ttaaatgaag tgggtttttt 18360
aaaacagcaa ttttctgtgc agatattaca cctgttcttg tatttttgtg attttacttt 18420
ttggaaagtc ag^aacttga aagctatgaa ttttcctaaa cttaccttct ccctctgttg 18480
gatgtaagta agttatcttc ttacttgctt gctttgtttt tcctttgtgt agctctttaa 18540
agagtgtatt caktcttttt gtaagtgatg tttctagaag tagcattggt gggtcgaagt 18600
gtgtatacat tttacatttt tgattgctaa gctgcagaaa agctgtattg gtatgtaagt 18660
actcgtttcc tt^ctatgct cgtcatttct agtgtctgct cttcctttcc ttcttcaaat 18720
gggtttggtt takttctagt tgctactgtt ccatcagagg aattgcagag aactggtctt 18780
caaaacagtg cagtatatac tttaggtgaa gatacttcta aaaacctttg tattttgagg 18840
taattctaga gtbccaagaa tttgcaaaaa gagtacattg tcagcaatat ttttcccaat 18900
ggtgacatct taatataact gtagcacagt agcagaatca ggaaattgtc attgggtaag 18960
gtacttttta attctccaaa taattcagcc ctccaaaaaa atcccacttc ttatgttttc 19020
aaacctgtag ctacttttga tgcgtacttc ctaaattgca tttttattac tttaaaaaat 19080
ataataccta gaagctcaaa gctggaaaca gcctgatcaa tatagtactc ttaagctaaa 19140
aacaacctga tcaatatagt actcttaggg aaatcactta tgcctgtggc tttttttaaa 19200
ttttcttcct gtcagctgtc tcttcatgat tttgtggttt ttattactgc ttataccata 19260
gatgaggtat agaaagtaaa agaagttaaa atgcattttt ctcaatttag tgaattaatg 19320
attacattca gatttatagg acaagggttg aagctancaa ggggttgata ggaatcttga 19380
tgtatctgag tattttcccc aactttatta catgactggt tcagactatt ttatctaatt 19440
acatttcact cttggcagaa atagcaaaac agtcaaccaa tggtcaatgc tgctgagaac 19500
tctggcctgt

tgctgctgat 

agtacaggga 

tgttaatttc 

tacctgccag 

tctagtagtg 

taatattgat 

ttccttcaga 

tctgcatgta 

actgacctga 

ccgagtgcct 

gggaaggctg 

tgattttaaa 

gaaaaatgtg 

atccatagac 

atgttaggaa 

gactgttgtt 

gtttcctggc 

ttcgcttgac 

tcgagtcagc 

tgacaacact 

tgtaatggtg 

gcctcagcct 

tattttcagt 

caggtgatct 

ccctgcctgc 

cacttttctt 

ggagttagag 

gtgtgtcagt 

ctcctttcag 

atggcattta 

catggcaggg 

aggaactgct 

ttccactcct 

gaagattctc 

tatgaagtta 

tgtaatctga 

gtagcaaata 

gtttccacta 

tgtattttgg 

aatataacca 

aaaaaattat 

ttccaagatg 

ttattgtctg 

aagacatgac 

gggctaggga 

gcccaatagt 

cactgtaggg 

tcatatgtat 

atccctgatt 

tagtattaaa 

tagatttccc 

ctaaaacttt 

ttttttaaca 

catgggtgat 

gagctcatgt 

ctcatcagga 

acctgtatca 

tcccatcttc 

tgtgttggaa 

gctactgtta 



gcagacatat 
ggatgtttct 
atggctcttg 
atgtagagtt 
cttttctttg 
acttgtggag 
cactagtact 
ttaaggactt 
tttgggacta 
atttcattta 
gtgactttgt 
ctttttgaaa 
tttctttaga
catataaaaa
gtggttcatt 
tcacaacaga 
ctgcttagac 
ctgatactca 
acatccttgt 
aggatatagt 
ttgccttttt 
caatctcagc 
cccgagtagc 
agagatgggg
gcctgcctcg 
tatttgcctt 
gttgaacagc 
agacccaggt 
gggcaaatta 
gcaccttaca 
aaagtggttg 
aagcgctgca 
gtttcatgag 
ctcctccaag 
acacagccaa 
ttgtcaaagt 
tgtcctatct 
tt^gtgacac 
aaatattcaa 
aa^tcttaca 
taaaaccatt 
tgcaggaaaa 
tg<:tgtagcc 
cttacttttg 
tagaaagaaa 
gagggcttca 
atagatatct 
gagaatcttt 
aataattata 
aaaaagagag 
atggctaaat 
atagctgaga 
attgtaattc 
tttccttcct 
gatagaatta 
gtgtattaac 
gatagtttgt 
tttgtctcct 
tgcacccact 
attttttctt 
tcaattcata 



tggctgtttt 
tccaggtttt 
atagatttga 
gtctgtttaa 
tccaggtttc 
tgggttctct 
gttaatttgt 
ctagaaaaca 
gaaggtacta 
tcagtttaga 
atcaccgctc 
gccttttaaa 
attacaggct 
tttgcatgta 
gtctgattgt 
gtatctctga 
tttctagttt 
aaagaattga 
ctctacattc 
ggctgttatt 
ttttttgaga 
tcactgcaac 
tgggattaca 
tttcaccatg 
gcctcccaaa 
tttaatctca 
atgcgtggtg 
tcctgttcgg 
cttatctctc 
gaggctgtct 
tgatgacagt 
catgcagtat 
gatagggatg 
ttaatgggaa 
atttattgct 
aatataatct 
gaaaggtagc 
aaagctttta 
gtttaagggc 
ctttctctta 
tgttttaaaa 
aatgctaaac 
tggtaaccat 
catttggtac 
tatgtttatc 
acaggaattt 
caacccaata 
gcaagcaaca 
agtcttaagt 
atgtataata 
cttctttggg 
gctaatcatc 
ctctaaatcc 
ctgactcact 
ttcttttcac 
tagagaagtc 
agccatttac 
gaatttcact 
ttgccacttc 
tttgaaagct 
ccatcttacc 



acttctaata 
aaatatcaaa 
ttttcctgca 
caggattctc 
agtatgaact 
gaacatttct 
gtgcttacta 
tccatgaaaa 
tgggaaggat 
gaaccacttc 
tggcaccaca 
attctgtaag 
ttagtcagta 
gttttagggt 
gtttaggtac 
aaafgtaatt 
gtcttctgcc 
catttaaatt 
tgtctctgtc 
tcttcccctt 
tggagtttca 
ctttgcctcc 
gacatgcacc 
ttggtcaggc 
gtgcagggat 
tgaaatgttc 
agtagaatgt 
cattgcagaa 
agagccttat 
cctaatcctg 
catagctcac 
ctcttggact 
aggaaattag 
ctatgactct 
atcttagtta 
cagctgtaac 
tgagaataaa 
tattttgact 
atagcccagg 
aaagttcttg 
taaaacccat 
ctggttttta 
acagaaccat 
aggtataaca 
agttattatt 
atatacttta 
acacaggttg 
ttctacttat 
catcaagaaa 
atggataaga 
atattctgac 
ttgtaatctg 
tcagctttta 
ccttttgttc 
agattaacag 
tcccttacat 
tttcaaatcc 
tcatttcctc 
ctgtttgttt 
caactaacaa 
cttgtttttg 



ccattctgct 
caaaagggat 
tttcctttat 
ttaaaattcc 
ccactcgatt 
ggaagtgttg 
catgttggct 
aacagattaa 
aatcttcata 
cccttccctt 
tcctcatccc 
ttgagaaaat 
tatgacagag 
ttcagagacc 
ccttctaaaa 
agcggaaaga 
aggcttgccg 
agtctctctc 
tctgttagct 
atccttcaac 
ctcttgttgc 
cgggttcaag 
accacgcctg 
tggtcttgaa 
tacaggcgtg 
tcttttcttg 
tataaaaagg 
atgctgttct 
tggtaaggtg 
gtagcgtacc 
cattagcata 
acacagggcc 
acttgctgcc 
gctttggctt 
aattatgcca 
tgagatagtc 
caagaaataa 
agttaagcta 
gcagcttatt 
ggaggggcat 
ttttaaaatt 
actttgtacg 
acagaattag 

gggtcgatta 

tcttccatct 
gagaaaagtg 
tgtctgtctc 
agggagccat 
aagttaactt 
gatttttctt 
tagtatggtg 
tggaaaactg 
ttttctacag 
tcattttcat 
ttttcttttc 
ttcattttta 
aagtttctgc 
tttaaaccat 
aattggcaag 
cttctaggaa 
caaccctttg 



tttcctgtcc 

ctgtgggccc 

tttgatccag 

ttcttcagtt 

aatagagctc 

ctgatagtga 

tttatatgta 

aaaaaacaat 

ctcagaccat 

caccctacct 

agcaggattt 

actaggggaa 

ccttttccta 

cctaaagcct 

cccttttgag 

acatttcaaa 

gaataaatga 

ttcccttgtt 

tatttctctc 

gatctacttt 

ccaggctggg 

ccattttcct 

gctaattttg 

ctcctgacct 

agccactgtg 

gctgaagtgt 

gatggacttt 

gcaataggct 

tgagtgatag 

tggctcatag 

gcgctggatc 

ctcatgaatt 

cctcactgcc 

gattgccatg 

gaacacaaaa 

agaaactgtc 

agagaattca 

gttcttaaat 

atgaacatga 

gtgaggccat 

cttccaaata 

ccaactatat 

ttctcagaat 

tatggtttct 

aaattagaag 

atcattgata 

tgggatcata 

aacaaaagtt 

gtgaatgata 

ggttaatttt 

cattgtctaa 

tcctctttgg 

actttttttt 

ggcctgagaa 

gagtatcgtt 

tgttttcttt 

ggttcttaag 

gtcctctgtt 

ggccactctc 

gttttttatt 

ttaataacat 



19560 

19620 

19680 

19740 

19800 

19860 

19920 

19980 

20040 

20100 

20160 

20220 

20280 

20340 

20400 

20460 

20520 

20580 

20640 

20700 

20760 

20820 

20880 

20940 

21000 

21060 

21120 

21180 

21240 

21300 

21360 

21420 

21480 

21540 

21600 

21660 

21720 

21780 

21840 

21900 

21960 

22020 

22080 

22140 

22200 

22260 

22320 

22380 

22440 

22500 

22560 

22620 

22680 

22740 

22800 

22860 

22920 

22980 

23040 

23100 

23160 
atttatttaa ctatagttat
catatatcta cccagcatca 
cctagtaccg agtcctaaaa 
acatatatat ac^cactgaa 
tttataattt ccbctttctc 
agctcatttt gtltccataca 
tttatctttg aaataaaaag 
aaatgtatat ttltttaaata 
agtacttttg aafcrcccattc 
ggtcctgtct ctgcagttga 
ctaggacatg tgpgctacgt 
tactttgctc agptaacttg 
tttttttttt ttttggagac
cacagctcac tgcagccttg 
agtagttggg accgtagtgc 
atgggggtct cgctgtgttg 
cctaccaaaa tgctggatta 
attttaaaat ttaacaatat 
tttattctca gttgaagtaa 
tgattatcca tatgcggttt 
gataccaaaa tctaaggtgt 
cctgtgcact actttaaatc 
catgtaaatg gttgttatac 
ttaattttta tttgttcaca 
ctcatggatg tgaagggcca 
gttcaaacaa ataggaaatt 
gataacagta attagtgatc 
acttactttc tcqagttctg 
ttctcccaag gcctctctcc 
ggtttttttc cctgtgccca
tacactggat gafctaatcct 
taaccttaat tapcacctct 



gggctttagt agactgattt 
tctttagcag gattgatttt 
cgctttttag tgactctgta 
gatcaggttt tttattttta 
gccaacaaaa ctaaacttca 
tatgttgaga caaggcaatt 
taaatataga ttaaattctg 
cctttttgat gactctggcc 
attgtggcag catggaggaa 
gcgaaacaag ttgcagagct 
atttttctta ccagctctca 
tttgacaatg tattttccca 
accctctatg tacaggtagg 
ggtcctttgc tggtcagtgg 
agtctataaa tagaactaac 
cttacttctt gtcttcttta 
agtgcaaatt ttgaatgtct 
atgttaaatc attcagtgta 
agttcaatct gctttgtata 
ccggagacaa agcaccatga 
tgtgctcaga agcttagctc 
ctgtcttccc tcctctcctc 
ctcatggtga aggaagaatt 
ttattcacat gcctcttcta 
tgttgggtga ca^aagcaga 
taacccttga gaaccaagct 
tgaatttggg ca^aattgtg 
aatatggttt atcgactttt 
tgtccatccc acttctctcc 



tagcagtctg 
ttgtaaggca 
actcattagt 
tatttagtgg 
attaaaaaaa 
catgctgctg 
cctttcaaga 
tctcagatag 
atagtacaac 
ctggattgtt 
gctgaaagaa 
tttccatgct 
agtctcactt 
acctctgggg 
ccaccatcat 
cccaggctgg 
taggtgtgaa 
acttcatgtg 
ttttgtttgg 
tccttcagca 
tcaagaccct 
atctctagat 
tttatttttt 
tatttttgat 
gctgcagtaa 
taaaggcata 
tgtatgatat 
gaggctagac 
caggcttgca 
agcacccctg 
tctacagaga 
aagtccctct 

gsrgggaacac 

ttaaaatccc 
ataagtgggt 
aaaatgtgta 
tgtgttttcc 
ttcagctaaa 
aaatgtagtt 
aattgttatt 
tctatgtaaa 
acgtggacgc 
gtttctaaat 
tgtgtaatta 
caggtttttc 
accttaaaga 
tgcattttaa 
gtatcagatt 
tacatgtaca 
aataatattt 
gaatgctgat 
tcaccggggc 
gtgtgcacag 
ttgccctccc 
cttcatggtc 

ggtgttttta 
atggaatgag 
tggtgtcttc 
gtcattgcat 
tacaaaaaaa 
cagctgtcca 



agatcatttt 
tgtgagacct 
agaagatgaa 
caattcatag 
ctttgttttc 
ttggattatt 
aatgcaaaaa 
atttaaagaa 
ctgtgaagag 
gctgacatct 
gggttaaaat 
tttctctcta 
tattgctcag 
ctcaagtgaa 
gcccggctaa 
tcttggactc 
ctgccatacc 
aatgtatggt 
catttttagt 
tctgtgggga 
catatagaat 
tacttataat 
atttgtatta 
ctgtgatttg 
aatgaaagag 
gaatttgata 
taaaaaaaaa 
atccaagatc 
gacagcatcc 
gcactgcttc 
ctgctaaggt 
ctctgaatac 
acttctgtcc 
taaagatcgt 
ggaagaattg 
tttaagaaag 
aatacagtgt 
tgttctttag 
ttcattctgt 
ttaaaagtaa 
gcgcagtgcc 
aggaactcca 
ttaagaatta 
ctaattcagg 
agccatcaga 
ggaataaaaa 
cctgacattt 
tggttttaga 
gaaaaaccaa 
tataaaactt 
tgctgccaaa 
gacttgggct 
gctttccctt 
ctcaccaggg 
agctggcgaa 
cattagaaca 
atgaacagtg 
aaagggtctg 
gtaggtctcc 
atttaaaaat 
gtcaccccag 



acttggttac 
ttgtttgatt 
gtgtccttgc 
ttgcatttgg 
tagactttag 
taggtatttt 
aaaaaagctc 
attttaaaca 
cctcatgtac 
tggccatcag 
ggctgccatt 
tatatgtagt 
gctgagtgca 
cctcctgcct 
attttctatt 
aagcaatctg
caaccctata 
ttttaaaatg 
ggtgtgtatt 
ttggttttag 
gggatagtat 
atctaataca 
ttttaattgt 
ttgaatctgc 
caaaaatgca 
ggcaattaca 
aagcaaactg 
aaggtgttga 
ttcttcctgt 
ctcttcttag 
cccactctga 
agtcacagtg 
gtaacagtgc 
gagtattgac 
ggagttaaat 
aaagcatttt 
cacatgcagt 
aagctaatgt 
actttttgca 
atgctctttc 
aaatttaacg 
gtaagagcct 
aattaaaatc 
gttatgctga 
aagattgctg 
gagcatttgg 
aagctagttt 
agcagcaact 
aaaaggatga 
tattccacaa 
ggcttttccc 
ttctctttca 
tcctttctgg 
gtcctgggca 
gtgcctggtg 
ttgcatctgt 
accctttatc 
tttagtctga 
aaaagacaga 
catgaattta 
caatggatga 



ataaggagca 
gctgtcctaa 
cttttgctga 
ccattttttg 
gatttagaga 
gtgactgtat 
aaaaaacaga 
tcctaatcat 
gcgctaactg 
gcagaatgcg 
gtatgggtgt 
ttataaattt 
gtggtgtgaa 
ctgcctccca 
ttttgtagag 
cctgtctcag 
aaaatgttat 
ggtttaatag 
tatatacgtc 
aaccaccaca 
ttgcatataa 
ttataaatgc 
tatattattt 
agatgtggaa 
aatgtacaaa 
ttaaactgtt 
tatatataaa 
cagggttagt 
gtcctcaggt 
aaggactagt 
ggcccttttt 
ggaactatta 
cacataaata 
atgttaagga 
ccatctgatg 
cattttaact 
ttttttgaat 
ttgaagatat 
agagaagttg 
tcccgatttg 
agaaagagat 
acccgttttt 
taagaattgt 
ggtaacagaa 
taaacaacta 
tgtcgttcag 
acaagctcat 
gttttctgtt 
atctctacaa 
aagtggggag 
ctggttccct 
gtacatgaca 
ctccctccct 
gcagctggag 
tgagcattgt 
tttgggcatg 
ctgttatagc 
aacagtgtgg 
ataagttggt 
taccttaaaa 
ctgctgtgga 



23220 

23280 

23340 

23400 

23460 

23520 

23580 

23640 

23700 

23760 

23820 

23880 

23940 

24000 

24060 

24120

24180 

24240 

24300 

24360 

24420 

24480 

24540 

24600 

24660 

24720 

24780 

24840 

24900 

24960 

25020 

25080 

25140 

25200 

25260 

25320 

25380 

25440 

25500 

25560 

25620 

25680 

25740 

25800 

25860 

25920 

25980 

26040 

26100 

26160 

26220 

26280 

26340 

26400 

26460 

26520 

26580 

26640 

26700 

26760 

26820 
gttccttctg

ttgggtgatg 

gctgccagtg 

atgtggctgc 

atctgagtgt 

tggtgtctgt 

ccccagtctt 

ttagatgcat 

ccttttttct 

gattcttcct 

tgctactgcg 

tgtgctagtg 

attttattat 

tatctgagtc 

gtcatacttc 

ggcaagttgt 

tgaggagtgg 

tttctcacag 

tcccgctaga 

gagtggctct 

ctcgctagtt 

tctcaggggt 

atgtattcct 

ttcatccgtg 

tgttccatag 

ctgcttttcc 

ttctttgtct 

ctccttgtgt 

aaaattgata 

ctaaacaacc 

ggttagttgg 

ctacagcttg 

acatacttaa 

gttagattta 

tttttttttt 

ggcacgatct 

cctcctgagt 

tagtagagac 

ccgcccgcct 

tcttcatcta 

aaactaagaa 

tttaccaggt 

aaatccattc 

tttttgtcaa 

ataaaggcat 

cacttgtttt 

cctttgctgc 

ggagtgtctt 

tagcgtgggc 

tgcagccagg 

gagtagagcc 

ggtccatggg 

ttttatgctt 

tcttgggtac 

ttgattcaaa 

gcaaaattcc 

aggccgtgct 

tggatggttg 

gttcttatga 

aagtcaactt 

ttattcagta 



tgtcctgctg 

ttggcatcct 

gcgctagggc 

cttcgtgagt 

cafcttaaaaa 

taittaatttt 

tcttctccag 

tattcgtttt 

ttttgtgtta 

ttggctgtgt 

tgtttcattc 

tatctatctg 

acttgcagtt 

tgcaaatttt 

ctaagatttt 

cttgatacct 

agtcagtcca 

cttcaggttt 

gcttttcctc 

cttctttatg 

tggtggtaga 

ggagccttct 

gcctttacta 

ccpcatgggt 

gaaaggtaga 

ctpccacaag 

tgj::gagtaca 

ctgtagccca 

aaatttttag 

acttgcatct

ttpccttgca 

tcpgtgttgt 

tggaagccgc 

caaaaacgtc 

tttttttttt 

tggctcactg 

agctgggatt 

agggtttcac 

cggcccccca 

gctttaacat 

ataaacattg 

tttccactaa 

aacttcagtg 

atctgttatt 

aatttaaaag 

ctgctgttct 

tggctgtgtt 

tctccggaga 

cccacccaaa 

gttgcagatg 

tttggtgttt

ttttggtttc

acttctgctt

ttttaaaata 

taacctagtc 

agbcttttat 

gagcagcttt 

ttfecatttct 

ttcttttgtt 

tcgtgtggct 

ttagtattta 



tgggcattgt 
atgcacagtg 
tgaaaaaatc 
ttcttcttgt 
tttttacctg 
ggaaatttct 
ttatgtttgt 
ttgttggttt 
cattttggat 
tgagtctact 
ctcacatttc 
atcataaagc 
ctcttaaatt 
gattacttta 
gcctaacgct 
ggaaatggat 
ctgaggaggt 
ctgtagaact 
agtgtctatt 
cctttcccca 
aggagaggga 
ctgatcctgc 
agagtttttc 
agcagggttt 
aagaaggatg 
cctacatcca 
caggaggtct 

ggggttcgtc 
ctaaattctt 
tgtttctcct 
accttgcagc 
tgttgttgta 
ctcccatttt 
agaaagaaca 
gagttggagt 
caacctctgc 
acaggcgtgc 
catgttggcc 
cagtgctggg 
ctaatgttga 
gtaccacact 
tgtccttttt 
ttttaaatta 
ttggttttaa 
gtgtgttggg 
tgcttatggt 
tggttaagtt 
acatttctac 
cgagattctg 
cacgtgagac 
cgttcacttg 
tgtgcccttc 
tactgtgttt 
atgttatatc 
tggttatggg 
gttcctagcg 
cactgtaagg 
taatatgtac 
tctgaaggtg 
tactagttct 
agaaatgcag 



atatatgaag 

gtcccttgct 

agctctttac 

ttttggtttg 

gattggtcct 

ttgctcttat 

gttggttcat 

ttttttaaat 

aatttctgtt 

ggtgagccag 

cctttgaccc 

ttagtcacgt 

ccctgcttga 

tctcttcaga 

gggccttttt 

agactt^tct 

gcactgcatt 

cattactttg 

tcacactcag 

ctatacttct 

agggaagtgt 

cttgcttctg 

cctgttctct 

tgttgcccct 

tgggctgggc 

gtcttccctg 

gtgggtcgag 

tgttccactg 

tttactggta 

ttgagttttc 

tctctgaagg 

gggttggcag 

tggttaataa 

gagtgttcct 

ctcggtctgt 

ctcacgggtt 

accgccatgc 

aggctggtct 

attataggtg 

catcttacat 

attaattgta 

ctgttctaaa 

ttgtttttca 

tcttcaagct 

ttatttcagt 

actttctttc 

atttgtggaa 

ctgttttagc 

agttgaaggt 

ctgctcacct 

tctgattctc 

atcttatgag 

gcttaatttc 

cagcttttta 

ctacgagaat 

cagtgtggat 

tcactgtccc 

gccctgtgag 

aattaagtaa 

catgaatcta 

aattttgttt 



caaatgaaga 

tttttgcccc 

acttgtcatg 

cagcagttta 

ctgagcttgg 

ttccttaaat 

ttctcgctgt 

tttttttttt 

gacccacctt 

tttaaggcac 

tgtttcatag 

tttccagttg 

taattccaac 

ttgtgcttta 

tgtaagacag 

ttctgcttgg 

tgggttttgc 

tttgtaggtt 

cgttttcaca 

tggatacttg 

cttttcattc 

gctgtaagtc 

tcacccagcc 

gttcatcagt 

cctgagccct 

accgcagtgt 

cctgtgaaat 

gctcatactt 

tctgttacat 

catctttcct 

gtctaagaaa 

tagtattcct 

atttcaaaac 

gtttattctt 

cacccaggct 

caagcaatct 

ccggctaatt 

cgaactcctg 

tgagccacca 

aacatggtat 

ctacagattt 

atacaatcca 

ttatatgaag 

tgtctttgtt 

gcctaaagtc 

cttgtttgct 

atcagttgaa 

tgggcccctt 

gaactgagcc 

ctcatttact 

tcttcacagt 

tcttgtaaat 

agtcttaaca 

agttgttttc 

agcctccctg 

aacagactgg 

aggtcgggtt 

agcggataca 

gtgacatggt 

ttccatgatt 

caaaaaatat 



tagctgcctt 

catgaatata 

tgtcttgttt 

agtatcatat 

atctatgatt 

attattccta 

tctttagttc 

acgccccctc 

tgagttcatg 

tcttcatctc 

tttccatctc 

aacctttatc 

atctgggcca 

tcttgccttt 

gagaaatgga 

cttttagtgt 

tcatgtgctt 

ggggatgtcc 

tagcaccttg 

ttactgaact 

ttagggagaa 

tgtgcccagt 

tcatcgagta 

ttcaggctgc 

tcccacaggg 

gttttctttt 

gtgctgcatt 

ggctttctgc 

tggcccccaa 

tagacttttg 

agtcatgaat 

tcagcattct 

ttggaacaat 

tatatagctt 

ggagtgcagt 

cctgcctcag 

tttgtatttt 

acctcttgat 

cgcccagcct 

atatttgtca 

ttattcagac 

gaatagatac 

tgctgtgtgg 

tctttaagtg 

ttgtctgagt 

ttgttatctt 

gcctcaggtg 

aaggctcctc 

attcaggcag 

ttcaccctga 

tctattagaa 

caaagttctg 

tcttgccaac 

agtaggaagg 

ttttttgtgg 

caggttcaag 

tctaagaatc 

tcttgctcag 

agaatatgtt 

gtatcagttc 

atttgtatta 



26880 
26940 
27000 
27060 
27120 
27180 
27240 
27300 
27360 
27420 
27480 
27540 
27600 
27660 
27720 
27780 
27840 
27900 
27960 
28020 
28080 
28140 
28200 
28260 
28320 
28380 
28440 
28500 
28560 
28620 
28680 
28740 
28800 
28860 
28920 
28980 
29040 
29100 
29160 
29220 
29280 
29340 
29400 
29460 
29520 
29580 
29640 
29700 
29760 
29820 
29880 
29940 
30000 
30060 
30120 
30180 
30240 
30300 
30360 
30420 
30480 
taagttgtga
gaacttattg 
cattttgtta 
ttcttccagc 
ttggttttga 
ttcaggtaga 
tttatgttat 
ctaagttttt 
gggccttgct 
tctagagcac 
atgtgttcag 
cttggctgtt 
ctcactaagc 
tggaacgctt 
tcagggaata 
cattggagca 
gcaattcatg 
acagatataa 
gagcacatac 
ccttcaatgg 
gagatgagcc 
cttttcctat 
ttatatgtat 
taatttccac 
tttctccttt 
ttatttataa 
gtaaccatca 
tagatgggtc 
tatatatcct 
aaatagtgta 
ggttttttgc 
tgggcttttc 
aggattgaat 
ttatctcagt 
cctaagaatt 
gtttgttgac 
tcatctagta 
gttttagaaa 
cttgctaaat 
aatcctttcc 
ttcctagccc 
tctttttctt 
gtatttcttt 
cagttcctgg 
catgccttcc 
accacatgag 
ctgatagccc 
cattttcatg 
tttccttcta 
cttttctaca 
taaattattg 
gatacatata 
tgaatgaaca 
cattatacca 
atgatcatag 
tgttgtctga 
catgagctcc 
acttattttt 
ctcaagctag 
ataagcgtaa 
ttatattatt 



agaaatacat 
gttgtggatg 
gactatgagc 
tttcacttaa 
aattagattc 
ttttctggat 
ggagagacgg 
tcctgccact 
ctggattagg 
tcaaactttc 
tgaagtagca 
tagtggaaag 
catttctagc 
agcagccatt 
gggaaaccca 
gtcagaacac 
gcacccccaa 
ta^tatgaaa 
tg^tggaaaa 
ga^aaaaatg 
tgtcactcct 
ttgtgaatag 
cttttgaaga
ttj:ttgtatc 
gttttcttct
ttaatgtttt 
ctatagaaaa 
tgttatccct 
tatgccaata 
gttgtcttct 
atttctgtgt 
attttgattg 
ctttggattc 
gtgttttgta 
tcagtttttg 
ctaggcatat 
acttacaaaa 
taacagtttt 
tttctggcta 
atattcctca 
cattgccctt 
tgcaatcata 
tcptcacttg 
aapataataa 
tgbtgtctgt 
ttattaacct 
ackgtaattg 
aa^atgaatg 
gttgctataa
ttttcttaac 
ttfcaataata 
atattatctg 
aaaagcttgc 
gtatctaaag 
gtcatccaga 
gtactgtctg 
tgtcttgttt 
gtcttttggc 
gagtttttgc 
aaacttagac 
ttagtttcag 



ctccataatt 
caaatgaagc 
tagtaaaact 
gttccttttc 
aggtttaaat 
aacttgctat 
cttctttcct 
tctttacctc 
atttgcttta 
tccatatcag 
cttttaattc 
gacctagctt 
tattgatgta 
gtagggttat 
aggggcggta 
acacgacatt 
acaattacaa 
tattgtgaga 
atggcaccaa 
caatttccgt 
aagaatgttc 
tatctttttt 
acatactttt 
ctatttaaga 
ggaaatttta 
catatggtgt 
gattatttcc 
agatcaatgg 
ccatactgtc 
aaatttgttc 
gaattataga 
tattgaagat 
atgaacgtag 
gtttaatgta 
atactattgt 
atttgacttt 
tatattcctt 
actttgtcct 
gacctcctag 
tctttaggga 
cctaaatttt 
tcatgatatg 
tccatgaagg 
gtatataaga 
caatgttctt 
gagaaataat 
ctttcatggc 
tgtggtgttt 
tataataagg 
agatctggtg 
atattaatta 
ttaatttcta 
atatttgcgt 
aaaaaattca 
tgaaggaagg 
agatctggca 
cctgtattct 
tgtcaatcaa 
tgtataattt 
taattgatta 
tttatataac 



attgctggga 
atatttgtga 
tatggcacaa 
agataggagg 
cccagatctt 
agcttatacg 
taaacctcac 
tctcagcctt 
agggagtgtt 
caataaggct 
tctttaagaa 
ttgacctacc 
aagtgagaga 
taattggcct 
gagagaaaga 
tatcaattaa 
tagtaacatc 
ttaccgaaat 
tagacttgct 
gaagctcagt 
ctgtacaagt 
gagtacgtgt 
aagcttaatt 
agtccttgcc 
gagttttgct 
aagatcgaag 
ccccaatgtt 
agcatttgtt 
ttaataatgc 
tttcttttca 
attagctcga 
atagatgaat 
cctgcatttg 
cagatttgca 
agatgacatt 
ttaatatact 
aggatttcct 
ttttaatctt 
tacagccttg 
aaagcactca 
ttctcatcat 
taacgacatg 
gaaggaccat 
aatagtttct 
ttaaattaaa 
cgttttattt 
tttgaatata 
ggaactagct 
aattttgtat 
aatcttcatt 
ttaaaaataa 
agttaggtgt 
ggaagctgaa 
gtaccacata 
cttctgtacc 
agaatgaatc 
gtttgtattt 
agttattagt 
taatgtttct 
cttattaaac 
aaatgaggtt 



caatacagta 
taaaaataac 
acatggagac 
cagcctggtg 
ctgtttaatc 
tcagtacttg 
gaaccaacct 
cagagaatta 
gtggctggtt 
gttttgcttt 
cttttccttt 
ttggctttca 
catgcaactc 
aatttcaata 
gagacaggag 
atttgtcatc 
agagatcaca 
atgacacaga 
cgatgcaggg 
aaagcgaagc 
tttttgcatc 
gtttttttat 
tattgatttt 
aaacttaagg 
tttacattta 
ttcatatttt 
tgaaataagt 
ctgttatatt 
ttgctttgca 
aagttgtttt 
caatttctac 
ttgggaagaa 
tttacttagg 
catcttttgc 
taaaaaaatt 
aaccttgcta 
acataaacaa 
gatggctttt 
actagaactg 
ttcttttatc 
tttccttcat 
tttttattta 
atgtgttgtt 
gaattagctg 
catctaagac 
ataaatgact 
aaccttactg 
ttaatgtttg 
gtttttccta 
attaaatata 
tataaattat 
gggttctgaa 
agtacgaaat 
ggtttttaag 
agacgtacag 
caataaacgt 
gaaaagattt 
gtagtttttg 
gtttttactt 
gtccagcttg 
tcttataaat 



ttttcttaag 
taatagaagt 
ttaacacttt 
gataagagta 
tttattttat 
ccacttcaat 
ctgctagctt 
aagggagtta 
tgatgtttta 
ctaatcattc 
gcatccgcaa 
acataccttc 
ttcctttcac 
ttgttgtgtc 
aacaggccat 
ttatatgggt 
gatcacaata 
gacgtgaggt 
ttgtcataaa 
atgataaaat 
tgttacttac 
ttttatacat 
ttttctctca 
ttgctaagat 
gttctaggat 
tttaatatag 
agactgaata 
gatctatata 
gtaagttttt 
ggctatttta 
ccaaagtttg 
ttgatataac 
tcttctttat 
cagatatatc 
tcaagttttt 
aacttattta 
tcatgtcatt 
atttcttttt 
gtgtgaggga 
cattctttag 
cacaccttgt 
tctgtttaat 
atcctttgtg 
tgaatgaatt 
agcaaataat 
gagttgaaag 
ttacaaaaca 
tcttcctgtt 
attgtaccca 
attatacata 
taaatataaa 
gactattata 
ttttagatac 
taggagctgt 
aggtagacag 
agttttctcc 
ggtgtgcata 
taactcagtt 
tcctaagcag 
atattcttct 
aaaatttaaa 



30540 
30600 
30660 
30720 
30780 
30840 
30900 
30960 
31020 
31080 
31140 
31200 
31260 
31320 
31380 
31440 
31500 
31560 
31620 
31680 
31740 
31800 
31860 
31920 
31980 
32040 
32100 
32160 
32220 
32280 
32340 
32400 
32460 
32520 
32580 
32640 
32700 
32760 
32820 
32880 
32940 
33000 
33060 
33120 
33180 
33240 
33300 
33360 
33420 
33480 
33540 
33600 
33660 
33720 
33780 
33840 
33900 
33960 
34020 
34080 
34140 



atgcactaaa 
tagaacgttt 
gtgttggtac 
ggatctgaaa 
tttaattagt 
taccaatgaa 
tgcattttgt 
tactgtagga 
aaaacaaagc 
atttttccag 
gcatttgctg 
ataataacat 
tgaatgtatt 
tccttaattc 
ttacgataca 
tagatggaag 
atgacagacc 
gaagttgtgt 
agtagatcta 
gttacacaaa 
tcttaatatc 
accaaattgg 
agcttgactg 
ccttgttggt 
taatggtaaa 
gagcattcaa 
aattaaaaat 
atgacatcat 
attaagtaag 
cttttgactt 
cttttgcaat 
aaccactctc 
ccctcaattt 
ccattatcct 
ttcatttctc 
cactttttgt 
gactgctgat 
catttgtaca 
tcagcatatt 
aactccatca 
gttgcatacc 
gcagaagtgc 
tttaactaat 
gtggaattaa 
tttttttttt 
ctttttgaac 
gatacctaga 
agagttcttt 
agacttaact 
gcaaatggca 
agtgcttcat 
attcaagaga 
agaagtagaa 
aagctagtgc 
ccctgtggct 
tctcttacat 
ggctgcagtc 
tggttcgggg 
tccagtgctc 
ccatatagct 
gatttctaat 



ggagctgtgt 
cacatggtgg 
ttaacgatac 
gggcaaaaac 
gaattattag 
tgtagcactg 
ttgcccctct 
aatactctgt 
tcacacaaaa 
aaggtacaag 
ccpaacgtgg 
ttttagtttt 
cabtccttga 
acbttcatgc 
tattcttgtg
ctttttctac 
ta^gttggag 
aafcgttgctg 
aatctgtgtg 
gttggtccac
catattttag 
gaagtataca 
tgatttagtg 
gacattacag 
gttcttcatc 
catgtgacta 
aagggggagt 
ttgtaacagt 
gagtgtttca 
tcattgcttc 
gccattgtct 
gaatctgtag 
gttttggtct 
cccaactctc 
cctgctacat 
ttctcatggt 
gactcgcaaa 
attgtccatt
gaagatagaa 
ttpagttttt 
tcttatagga 
aggcactgtg 
tttccttaga 
atgtatctta 
taattgccat 
tckcttaaaa 
atttccatgt 
caaaacagga 
gaataaaaat 
agctaataat 
aattagaaaa 
tacaaatgac 
attaaatgaa 
tcagctttgt 
gctctcaggc 
ttgagaagtc 
ctttgtggag 
tcctgtgctt 
tgcttttacc 
gactctggtt 
ctctctttcc 



gaaataggaa 
gaatttacta 
tgatttctaa 
tcattgaggc 
catataatta 
catttaaaat 
tgaaacgaag 
tagcattagt 
taaaccaaat 
gtataatcca 
taagtaaaaa 
tcttcctgga 
attagtgtac 
tattattaca 
ccatggattt 
agtgtatggg 
tccaaactcg 
agcttgcttc 
aggattagat 
agtgcttgga 
aaaattgaat 
gaaaacagtg 
tgtgatctcc 
cagggcctat 
tgttctgtcc 
gtgcatgaaa 
ttttacaagg 
acttttaaaa 
gaataggagg 
ctctgtctaa 
cttttgccct 
tctacctttg 
gatttgaaat 
tggcgattac 
gtgttatttc 
ggccttcctc 
agcttcctcc 
agagagcttc 
tttatccttc 
ttgcctaagt 
aacttagaca 
gtaatattta 
cttgttttag 
atctgccacc 
ggttaaaacc 
gaaaacagaa 
tattcatagg 
gaacaaaggg 
tatttttatg 
attttataat 
gacataaact 
ccacacactt 
tactttgaag 
gtggtaacgg 
agggccacaa 
tgaaatgggt 
gcttgggggg 
ggtctgggat 
accttgaagt 
ctccctcctc 
ttggcccttc 



ttctgtgtga 
tatgattttc 
aatttgtatt 
tttgtatgag 
gaaatgtttt 
atagttcacg 
gtcacatgta 
aggtttagct 
ttgctctatg 
gagcaaacaa 
tttgagtgtt 
aaagatactt 
atattatctc 
tatatctgag 
atttaaaatc 
ttatatgtaa 
tacttttatt 
ttcatctctt 
tagaaaatat 
agctgttaat 
aattggtaca 
gctatgctat 
atatgttgat 
gacagtgctg 
agtgtgctgg 
ctaattttta 
tgcttacaag 
aatgccagtt 
gttcagttgg 
tagacatgac 
tttcacattt 
ttgtaagcac 
tctctcccta 
ttcctagcct 
cagtgtcagg 
taaatccatg 
cctccatgtc 
gcttgactgg 
catgcataca 
tttattcaca 
tggaggaaga 
aacttttctc 
gtatttggct 
tggacccatt 
atagttgcta 
atttaatgat 
gtgaataaca 
aataagctac 
tctcaaacat 
ataggatatt 
agaaaaatgg 
gaacaaatgt 
ccaacttctg 
cactctcgct 
acttggtggc 
cttactcagc 
atcttgttct 
cctgtgcttg 
tcatctggaa 
ctcactcgct 
tgcagcttgc 



agcttttgaa 
atcaaatgag 
tctaaaaatg 
tcagcgtttc 
tagattcttc 
ttatgttcat 
aataaatata 
tttttaggtt 
tcccacagat 
aagtcctttc 
tgaacaaata 
ttgttttaca 
ttaggaaatg 
aaattaagtt 
tatctaagta 
tggagcttct 
agctgtatgg 
aaaagaacat 
gtcaagtttc 
gtcttcaaca 
ccaataagct 
gttcttagag 
agtcactcac 
tctaatggaa 
ctcctaccaa 
attttattta 
agcagatatg 
tgtttttaaa 
tctccccatc 
gttctgtcat 
attaaacaga 
tttttccagt 
gacttctgtg 
cctttccagc 
ttttggtgtt 
gctttagcca 
tctctgccta 
cccaaaagga 
ctcatatttc 
aaaagaacaa 
agctgttcag 
agctgttcga 
ttctaatggt 
aaagtaagcc 
gcgaaggtga 
gtgtctataa 
ctggcgattg 
aaagcaattt 
catatgaaca 
aatatactta 
gaaaagggca 
ttattctttc 
agaaagcata 
cttaagaagg 
ttaaaacacc 
tgaaatcaag 
cctgtacggg 
gttcgaggtc 
atggcactgg 
ctaaacctgt 
agggccttct 



tgtgaacatt 
gtacttttta 
acgtattaca 
atggcctatt 
atggctgacc 
acttaattgt 
cattttctcc 
aacaataaca 
gtatcttgtg 
agctagtcag 
attttcaaag 
gttgaaggaa 
aagtttcttc 
gaagtgcttg 
catgattatg 
gttttgtaag 
ttgcaacttg 
atgccttata 
tattggagaa 
atggtaatgt 
atgcaattta 
gtgtctttga 
tgagcaaata 
ctttctgcaa 
tgtggttttt 
attttagttt 
tcataggtat 
cacatgtcct 
tgccagctct 
ttcagttgct 
acaaaacaaa 
actcactctg 
gggctgttct 
ctctttctgc 
tgattaattt 
tcgtttcctt 
actctggacc 
tgtctcaaac 
ttgtcttggt 
attgatagca 
atggggtcct 
agggttttgt 
tataagggat 
cctatggtgg 
catacttaag 
tggcaaacca 
tagagatttg 
ttttctttgt 
aatttagttg 
atattacaaa 
tgaataagaa 
tcataatcaa 
gcaaacaaga 
tgtgtttgct 
acagatttct 
gtgttggcag 
gtcctgtgct 
ctgtgctggg 
ctcgcccaca 
gtttttggct 
gcagctcttg 



34200 
34260 
34320 
34380 
34440 
34500 
34560 
34620 
34680 
34740 
34800 
34860 
34920 
34980 
35040 
35100
35160 
35220 
35280 
35340 
35400 
35460 
35520 
35580 
35640 
35700 
35760 
35820 
35880 
35940 
36000 
36060 
36120 
36180 
36240 
36300 
36360 
36420 
36480 
36540 
36600 
36660 
36720 
36780 
36840 
36900 
36960 
37020 
37080 
37140 
37200 
37260 
37320 
37380 
37440 
37500 
37560 
37620 
37680 
37740 
37800 
tctgccccag ccccggggtc tgcccatccc agtgctgggc tgttctgttc ctgccctgcc 37860

tttcctcagc ccttggcaac cctgtttgtt ttctcccttc cttagcagtg gagaacatcg 37920 

taagatcaat gctgactgcc ttctgcagcc aagccaggcc atttcatttc agccgagcca 37980 

agtctgtgtg gagcagttct tttatttttc tccttttgac tacctcatgg ttttcacgga 38040 

tttttgttct cttcacattc aaggattttt tgctttcaga aagttatatt tctctggaaa 38100 

gagtgcaccc aafcatccctt ttgatttcaa aatcttaatg tggagtctct tgacttggat 38160 

ttctttggaa gaaactgctg aagctgccat gtctaagaag aaaactttgg agaaaaattt 38220 

tcttcttaga ca^ggcaacg tcaacagttt ctaagctctt gattccgtct accctgtctc 38280 

catcgttgcc tc^gtcatct gccttacttc tctgcagggg tttctcccag cttgcaaatg 38340 

tactccaatt ctgaaataac taagtctata gctgtgcaaa gagaagtctg ggccccttgc 38400

tttcttgtgt ttgfactccat ccactctcca gaaatgaatc ccacttctca cttaaccact 38460 

gacctccaaa gchtcgtatc atttgtgtca gttgtcatat ttgttaactt tcacataact 38520

tttgacatta ttf:atacctt tataaccagg aaataatttt aactttattg tagaaataaa 38580 

caatggagta taatttttct tgttgaagat aaatatcacc tcctcttcct ttaaacatct 38640 

cttccctttg tttttgtatt acattggttt cccccctttt tttatttcct gggttgtcgt 38700 

attccctgtt attattttta cctttttttt tttaatgtgg atgtttccgg agtctgtatt 38760

tcttgccttt tcatcttctg ccctttatta ttctcagcca ctgccattac ttcagttatc 38820 

cattcccatg gtttccacat gcttagcttc ggttgattct tgccatttta cagaccatat 38880 

ttccaactac ttctagaatg ttttgttcct tcagcctcag tatgcccaat ttgaactcat 38940

gttctctctc ccccttcttt cttccttctt tctttcgctc tctctccctt ccttcttttc 39000 

tttccctccc tccctttctt ccttccctca ctcgttctct cttgcttgct tgctttctct 39060 

cctctctctc ttttctttct gcnnnnnnnn nnnattcttc tccctccctc tcttccttct 39120 

ctcccccact ccccaacttc caggctaaag cagtcctcct gagtagttag gactacagac 39180 

atacacgtgc caccgcgccc agctccgtgt tctctttgtt tccctgcctc ctgctcttcc 39240 

acttatcttt gcatggcagg tgggtgcacg caggcatgct ctgcatgtct tcctcttggc 39300 

cattcccctt ctagttatgg tgtggcttta tctacgcgtt ctggagcaga agcctagtca 39360 

caaagctatt tttttaaaac attcatgata attcatttcc ttttatgttt taaaaatact 39420 

agctttctgt ctttatttcc ttactaactt acttggatgc cagtaattag ttgttttagt 39480 

gaacaccaca gagtgatatt ttgaaacttt ggacttcata aagttggatg agctccagta 39540 

gcaaagaagg aagtgttaac tagtttaact gacaaataaa tgcttcccag cttggtgtgc 39600 

gattgagatt ttfigttgcaa gtttgtgaat caatttaact gcccctgccc tggggactaa 39660 

agtcagatac gt^cttgtgg gaatctttgt ctttcccaca ccaccctgca ttttaaaacc 39720 

tcttgtgtgg gapagtccca ccatgtaata gctgttcttc cttactcagc tactttccct 39780 

ccagagaggc cagtagaaaa tctagactag ttttttatag tctattttca tgtcacttat 39840 

tgagagctac tgttttctgt taaattgtca gtaaatattt taatcaagga aaagggaggc 39900 

aataggaagg ag^gaagaac aaatccttaa ccctagtagg aacctaatga atgggatttg 39960 

ttctggataa tt^cagtagt cccccagcta aagaaccttt taaaaatatg tcagatatac 40020 

ccaagaggat tgaaatcgta tgttcataca aaagcttgtt cacctgcagc cttcatatgc 40080 

aattcctatg aatgttcata gcagcattat tcataatagc caaagtatgg atgcaaccca 40140 

aatgtccatg aagcaattaa taggtaaaca aaatgtgatc tgttcacaca gtggaatact 40200 

aactattcag ccataaaaag gaatgaagca ctgagtcctg cagccacaca gatgaacctc 40260 

agatccatgc tgagcgaaag aagccagaaa caggaggcca tgtgctgtgt gactgtattt 40320 

ctaggaaatc ttgagtcacc atgggcaaga tgctatcacc tttgttcagt ggccagaagc 40380 

gagggcacta atatttaccc ttgccggggt ctactagatt gaagcgtttc cgctaggcca 40440 

taaacttcca acacggtgac ttgtacatgt agatatttga tcaatatata gcaaatgaat 40500 

attgatttaa acagaaaaag gcaagtgaga gtgctttcta aacttagagc cctaaatata 40560 

tgaggttgtg gaattaatag attctgttgt gtgtgtttga gggaatttaa aaataattta 40620 

gatgttaaac agtatattgt ggaggtgttt tgtaactaat taatgacggc actgaattga 40680 

cttctaggcc ttgcagtatt aaaacatgtg ctaacaccac gaataaaggc aactcacgtt 40740 

gcttttgatt gcatgaagaa ttatttagat gcaatttatg atgttacggt ggtttatgaa 40800 

gggaaagacg atggagggca gcgaagagag tcaccgacca tgacgggtaa gtgtgttcac 40860 

gcacctgaaa tgcctgtaca cggtatatac agtgcacatg tttatgtaga attcagtttt 40920 

acaaagtagg ttaagtgtac ttttttcctc cattacattt acccggtata tttttcaaga 40980 

tgttattaag atgtaacagt ggagatttca ttagtcctgc aaagtgtggt atttcttggc 41040 

tgtcgtgtga gtcctgtgga ctcaccaatt atcattaatc cagcctcttt ctactcaaag 41100 

ttcacactta aaaggaaagc tctgtaaaag ggaggaagac gtgaagaagg agcacgcctg 41160 

gcagtactga gt^cacgtta ttagtcagtg ctgccctttt gctgtatttt tcgtaaaata 41220 

tttattaaat ttgggtgtca ttgtgacaag aagaaatgca gttaagtgtg accttttttt 41280 

ttccccaaac atgttaggtt ttaagaacct ttgagctatt gtcagatata accagaaaaa 41340 

aatagaattt ta^gtgagca ggataactta gttaaactaa ccaaacatag tgttagctgt 41400 

tagagaaatg taaacatgga aataggcaaa cagggaagtg tgtggagttt ctgtttcctt 41460 
ttcaaaatat
gtttttcgtt 
ggcgcaatct 
gcctcctgag 
ttagtagata 
gatccacgca 
gttgagccac 
caaaatatga 
gttgatgtgc 
aagtagacat 
tacacgtagg 
tgtcatcttt 
ttagtctgtt 
actgaaagag 
gcccaggggc 
tttatatctg 
gaaaactaag 
aaaaattaaa 
ctggccttcc 
gtgcaaattt 
tattaatatt 
aacagacagt 
gtttataacc 
ttttatcatg 
cagcctctgt 
gcttcctcct 
cctgcacccc 
ggaaaccacc 
cggcaccttt 
tggcactgat 
tcccagccgc 
ggttccaacc 
gggtggtcac 
ctgacttacc 
atttctgaat 
aattatttgc 
acaaattgat 
acatttcaca 
cctttttttt 
gtgggtgaga 
gtacaaacct 
ttttagattt 
agatataata 
aactctgctt 
tttatatgaa 
ttaaattata 
ctgtaagtac 
ctatttaata 
attgttgata 
gaattagctg 
ggacgctctg 
ggaattcttg 
cttgtccttc 
cctgggcatt 
agacgaaaca 
tgttgtaatg 
tgtatgatcc 
catttctatt 
taatcccagc 
gcgtgaccaa 
tggtaggcac 



ctgtttgagc 
ttittgttttg 
tgactcactg 
tagctggaac 
tgggatttca 
cctcggcctc
tatgtatgtc
gcbtccagca 
cttggaaatt 
tattaaagca 
ccttctagac 
ggatgttagg 
aaataatgat 
agagcagctg 
tagctagtat 
aaagatgatt 
gtgtttttca 
actgtacgaa 
cttctccttc 
aagtatatct 
tttctctttt 
tcatcaagat 
aatgaaaaac 
tcccctcctg 
tgcctgccat 
gtgctgtgcc 
atgcatattc 
ctctcttttg 
ccpagcatag 
ggcgtgtgtt 
atcataagtg 
tcccactgcc 
tgpctatgtg 
cagataccat 
taaaggaaaa 
aacctagatc 
aagaagatga 
tttaacctca 
tttttaatcc 
tttaagtatt 
ttgtagaagt 
agcattcccg 
ataacacgtt 
tcattgtttt 
ttgatgggtg 
tgacatctga 
tttgtggcca 
ttcatatttt 
tgccatggcc 
cagtgattgt 
caggtgagag 
tagtcacctt 
cc^ttttatg 
ctctcatggt 
gtgtagtttg 
taataaattg 
tagttaagtt 
aatattttta 
attttgggat 
cacggtgaaa 
ctgtaattcc 



tggggttgag 

agacaagagt 

caacctccgc 

tacatgcgtg 

ccttgttggc 

ccaaatgagc 

agtgtgcttg 

cgttttacat 

ttatagagta 

ttcagaagtg 

gtagtactgt 

gatttttcca 

aagatgaggg 

tctactgcag 

aaaaattggt 

gttctcataa 

ttttagatgt 

atgcacagtg 

ctagcgataa 

tcttattcta 

aaactatcga 

tgtcgttggt 

agatagactc 

tctaagaacc 

cggaggaatg 

gtgaagcctc 

agtagttgaa 

ttgccctcat 

cactgtgcct 

acagtgctgg 

ccttgaggaa 

ctgcttatcc 

tattcattac 

aaagaaaata 

atacaccaga 

atagaaaagg 

ctgataacta 

ttcatgataa 

attagattgg 

atcaggcatt 

tgctttggaa 

ctttctgaga 

tttccttcta 

tttttgtttt 

ttctggtctg 

tataagttgt 

catttcatta 

atgatgcaat 

cagtgtttct 

tgaacatgca 

ctgggaagct 

catgaggtct 

gtgtaagttt 

tcactgcttc 

attactattg 

tgtgcttaag 

ttttctacca 

tatttaaagt 

gctgaggcgg 

tcccatctct 

agctactcag 



agagaacact 
ttcgctctgt 
ctcccacgtt 
tgccaccatg 
caggctggtc 
tttgtgtttt 
tatcagtagg 
ggaaaccctc 
atatttttaa 
agcaaggata 
gcaccgttac 
aagttcagtg 
tcactcaggt 
aaagttaggg 
tatggtcgaa 
ttgtatataa 
aaatgtttag 
aaacgtcttc 
ccagttttct 
ccatccctcc 
aggagttact 
ttattaaaca 
cccataataa 
ccttggttca 
cgttccagcc 
ggccgtggtg 
ggctttgtgt 
ccaaggctac 
tctcctgccc 
cacttagcac 
gccaaaacct 
tctgctacat 
aaattgtctc 
aaatcttatc 
gtaaaatcaa 
ggtcatttcc 
gaaagaaaaa 
ggtaagtgca 
caaacatccc 
tttatacttt 
atgtctctca 
cattattcaa 
gtgtgttgct 
ctgtcactgg 
gttataatct 
gttaggtaga 
gtattaaata 
taagaaataa 
caaagcattc 
gggcctctgc 
gtagaagctg 
tatgttgagg 
cattttaagg 
ttgtaatcat 
attttttttt 
gacaaccttt 
gtattttcat 
atggaggccg 
gtggatcaca 
actaaaaata 
gaggctgagg 



aggcttcatg 
cgcccaggct 
cacacgattc 
catgactaat 
tcaaactcct 
tacctcatca 
atctactgag 
acctgaagca 
ctacaacaaa 
gaaattattc 
attatctaac 
agattatagt 
tttaaaagaa 
agggaggctg 
ggaaaaaaaa 
cacagagtaa 
aatatgtaat 
cttgctttcc 
taatttgttg 
cttcttacag 
tacctatttt 
tagtttaaga 
ccttgtttaa 
gcagagctca 
gtgatctctg 
aagctggctg 
ggccaatcct 
tgttctccca 
ctgctcttgc 
agggctctgc 
tctgtgagtt 
gtgagctgac 
cttttgaaag 
acttcagtca 
gactgaaaga 
ttcttgcgta 
tgggtaaaga 
aatgaaaact 
aaggtttgat 
gctgttagga 
gatgtacaaa 
catgtatacg 
tttaacctgt 
ctcagccctg 
actttagttt 
aaattctgta 
ttatctctat 
tttttttctg 
tgggggatca 
tccactccac 
cagtgctaac 
agaggcagcc 
gaggtataaa 
ggaagatgtc 
aattattttt 
ggtattctat 
attacaacat 
ggcacagtgg 
aggtcaggag 
caaaaattag 
taggagaatc 



gggttttttt 
ggagtgcagt 
tcctgcctta 
atttgtattt 
tacctcaggt 
gctgtttggg 
ggcagatgtt 
ttcgtctgaa 
acatttataa 
tgcccaacct 
actgtctgtg 
tgtcaaatga 
aagctctttg 
gaggagtgag 
atgtaacata 
ttgtaaagta 
gcatcagttt 
accctgctac 
tgcgttgtat 
aaaagtggca 
tgcatttcaa 
ttaaacaagt 
atgctgctac 
tgggtaaggc 
ccttgccttc 
actgagtcct 
gctttccaca 
gagtgacagg 
agtactgctg 
ctttctctct 
gcattgcctg 
tgtggctttg 
attgaccttt 
aggataaagt 
caaactggga 
aagtgcactt 
acaacaatag 
acaggggata 
cataggctca 
atgcaatgta 
tgcattcaca 
tgtgcacata 
agcttgaaaa 
ctttcaattg 
aagagtcact 
acttggaata 
atatagtagg 
aagttggtag 
ctgtttgtca 
gttgctacca 
aaatgctaca 
agtagtgtcc 
tcaaagccca 
attgcggcag 
ctgaagtggc 
ttgagtattg 
atttactttc 
ctcacgcgtg 
ttctagacca 
ccgggcacag 
acttgaatcc 



41520 
41580 
41640 
41700 
41760 
41820 
41880 
41940 
42000 
42060 
42120 
42180 
42240 
42300 
42360 
42420
42480 
42540 
42600 
42660 
42720 
42780 
42840 
42900 
42960 
43020 
43080 
43140 
43200 
43260 
43320 
43380 
43440 
43500 
43560 
43620 
43680 
43740 
43800 
43860 
43920 
43980 
44040 
44100 
44160 
44220 
44280 
44340 
44400 
44460 
44520 
44580 
44640 
44700 
44760 
44820 
44880 
44940 
45000 
45060 
45120 
crtt^Tlt^ agttgcagtg agctaagatc gtgccactgg actctagcct ggctgacaga 45180
mialtlr Srcctaaaaaa aaagggatca gggaagaggg gattacagat aacccaaaga 45240
agaaggaaaa atctccacaa gttcacctgt ccagcggtaa ccccaatttg gatattttcc 45300
tttaacaatt tggatatttt cctttaaatc ctctttttta taatgtcta? a?g?tgSaga 45360
llTttlllTt tttct^SS ^ttttaaaga tgagatttct gtgt^tgtct llltcllTg 45420
lltllt^ltt ^^^"^^'f^^*' gttataaaca gctgtacatg tcagtatata tacttccgta 45480 
a^^t^^^^^^ aaaggctata tagtgttcat tgatgtgatt taacagcagt tatctccccg 45540 
ftn^t gttggaatgt gggtcctgtg tgttgccttc agagcaaatg gggcttggt? 45600 

calTtftnt^ tagacctgtg acctgtacga atagttggaa gactttctct attaccclag 45660

gag?att?tt tcctaaa?^^ cctactagaa atttatgggt agaaaaacaa taatatctti 45720

tnflt^tll tcctagattc cctaaggtgc tatagggtga tttttactca tgtaacatga 45780 

ff^'^f^^^'^^ gtttttgcaa atgtggatat ataagtactt tlttaaac?? 45840 

tttataccac ttatttcctc ccttcagtgt tagaacctcc taaatggSt 45900 

tllllllltn ^==^5Ctttcc actttgtcgc atgctcctct cattgtccct acctgggtcc 45960 

tgaaccttag ggacttggct gttatagccc caccatggct acgctgggcc ttggEcatct 46020

ctgagactta gtttcttcaf cttacaagga gataataLa gfeccctSct gcg?aSIa?t 46080

ItlTdTatt ?f ttaacatact caaaagcatg ccgtaaa^c atStgagca 46140

catgtacgtt tt^ggaaaaa caaaaggacc catgcacatt tcggagtgct tttqtctcaa 46200

Itltttl^'''' tcttcttcca aagctgacgt cttagtagag gccctgccac g^cctgagcl 46260

acaatctcta iS^altlllt ^^"^f attcgaaatg cagtctgttc latcttcct? 46320

Taltlllalt Ittrrtl tgaaataccg ggtatctgca gtgttgacca ggtgattact 46380 

taattatgga aatgttgagg tggagatcta gataattcag tgaaggcagg aaaattggtg 46440 

tcggaatctg tctttttatg tgtcagaaat agaaataaga taggg?ga|a agtaat??g? 46500

ggctaaaaca ctataatagc taacacatag tgcatactgt gtgccalgca c?cc?gtagg 46560

tgcttgaaat cttctattat tattatccct actttataga cttgcaccct taggcacaga 46620 

l^llllTr flt^T"^^ gttaccccag aggtggagat ccaggctacc tgactccacc 46680

atgtgtgctc ttbcctaggg cacagttgtg ctgctaaaaa tactttttaa gcagttcttt 46740 

??tattttaa tgtaggaaaa ttaagacaaa aataatgaaa LtLaaatc 46800

tttattttag tgttttgcac atgtattatt aaagccagtt tactcctgga agtgtgtaag 46860 

aatacagggt atttttgatc acctaaatgc tgcatgttac taagagcfcg acactgaag? 46920 

ca^f^f ^^"^ agttgcagag agtacttagc aaaaacggga agtgtgtggg gttgSgS 46980

caaagacaag tcttcctcgg acggtggagt gtagaattca tcatttctca gaacacgSt 47040 

ttgaacgcat tttcaatttg aggccaaagg tctcagcctc ccactcggca tacctcccta 47100 

taacaSctt f ^^^'^ ^^^^^^^^^^^ ttctttgttc ttcaaggaac tt^aatatg? 47160

lltl^llltl acctgtccac agggagcccc ctacaaagaa gggagtttct agtctccgtt 47220 

ccttttS^^ ataaataata gcctcatacc ttgtgcaatc gaggctgaaa aagactgLt 47280 

ccttttttca aataagcaag tcttagaaac tacagttgtt tacagggctc atggctattc 47340 

Itlttln^^^ attttggttc ttttaccaat tatataatat gttaaStat ggllagtatc 47400

tgSccccca lllllllltl T.lltT^ ^caatggcca agttagagag Jfggg^caat 47460

ItnZl^T *9^ttgttgt ggctgtgtag cagtcagtga cgagaagctg tgtgtcaggc 47520 

lallTr-tt^ gttgaggatt atcaggcgcc tgtgagtgcc cagctgtgt? ccaggtcigg 47580 

td^lnnt^ gtgagccaga ccagcttcct ctcggcccct gtggagctcg cag?ctgg£g 47640 

gggaggcagc agtcaccatg gtgacaggtg acacactagg atggggctgg tggtggtagg 47700 

cctaS? gaggtgagta tggacttaga ggaggctcll gcltlltlll 47760

ctatagcact aaaagttgtc acatgaaaaa taacatttgg tactattgat 47820 

ITalltlT ^^'=="^'^9ta attgtagttg acttagaaat tataacatgl tcttctaSt 47880

cagcttgaaa ccpccaacca ccagtttata atcctttttt tttaactttt gtttattttt 47940 

cattt??act acaacttttt ttgtcctgtt llllllllH 48000

c?ttottaaa ^^^^^^^^'^^ aaatagtaaa aaaaaagaat tatttttgtt 48060 

an^^f! ^ at^tctctgc aaagaatgtc caaaaattca tattcacatt gatcgtatcg 48120 

aaatcaaaaa 1^^^^^^" gaacaagaac atatgagaag atggctgcat gaacgtttcg 48180

tttcagtata attlJ-Trt f'^^^*^^^"^ cagcacttcc ggaacttcgg ttcaactaga 48240 

tt^atS^^ tgaaaccaat gtaaatggtt atattgtctc aagaatacat 48300 

aaS^^^o^= ^^^^^^^^f^^ tttatgcatg tctgatcgtg ttttaaactt tacttgtaca 48360 

gacaaScta tnt^T^^^ ttacagtggg cccatctact tgcattgata gtattLttg 48420 

actttcnf^f cgtgataaca tagcaaatta aattaaaaac aaoaacaaac acacaaaaal 48480 

^oa^^^^^^ gtcagatgcc cggacctacc tgtcaggtca cataaagtgg tgttactgtg 48540 

aallftllll =^^"^5?== agtgtgcgca gaaaagcaag ggaggggtag aggacta?gc 48600

gcgt?2atc caaaat^tc^ f ^ff^ tttgttggaa atagaagggg gcagttgaca 48660 

^ = ^ caaagtgtct tctgtggtta attatattca gaaattttag ccaattgttt 48720 

tattctctaa atatgtactt tctgctcaag aaactatcat tgttcttct? ttccttgttt 48780 
tacagtacag tgtttttaat taaccctcct gggttaactt taccaggtga aaatgattaa 48840

aagtgtaata ggttaacaat gaaactttaa gcttctattt ttcattgact cttaactgta 48900 

catgatgtaa tgtattcagc gagccattca ggaccacttt ggcccatgga agaaatttaa 48960 

aagtaagatc tacatgtatt gacatgaaaa tatgttctca gaaaaaagac taatgtattt 49020 

aatgtcctac ttattttata agtatttaga atacctctgg acattttaaa acaatgatta 49080 

ttgctagggt gtgtgattta taaagcaata gaagcgcttt ccctttctgt ttgtgtttta 49140 

gattattata tcgggtatgt tctgctatca taactttaca aatcttatgt aatatgggaa 49200 

aatgagttaa ctatgctgtt ttccttcttt tacctgcctt tctaattctg tgggaataaa 49260

ggcgtttttg agacagccca ggtgtagtga gcagtccata tccatggatt ccacattcat 49320 

ggattccacc aagcacagac caaaaatact cagaaaaaaa gggggctggc tgtggtggct 49380 

catgcatgta atcccagcac tttgggaggc taaggcaggc aaattgcttg agcccagaag 49440 

ttcaagacag cctgggcaac atggcaaaac cctgtctcta cagaaaatac aaaaattagc 49500 

caggcgtgca cctgtagtcc cagctactca ggaggccgag gtgcgaggat cacctgagcc 49560 

tggaaggttg agactgcagt gagctatcat tgtgccaact ccagcctggt aacagagtgc 49620 

cttttttcaa aaaaaaaaaa aaaaaaggat ttgggaggat atgcatatgt tatattcaaa 49680 

tacatgccat ttf:attcata tatcaggga^ ttgagcatcc tttgatcttg gtctctgccg 49740

ggtatcctgg gaccagcccc ctgtcgatac agagggaccg ctgtctaaga accgctggtc 49800 

ctatctttga ct):ctggcgg aataggagct ccatgtaaaa aggaggagaa gctgcagcgg 49860 

gttattagcc atttgtgagt caggtcactg taaaacttta tcaaaagttt aaaagacaaa 49920 

aagcatcctc at^iaaatgcc ttaaaaccac ctgttgaaat attacatata caattcatgt 49980 

atactaatca tagagcatat taaagatatt ttagaagact agaaacttct attaaaccaa 50040 

gtttctggat gtttccgtat tcatccttat tttccaggga cctgcataac ttttccagcg 50100 

tgtaatagct acctgattga tattttttga attgaaatac tgaagtgact aaaatctaaa 50160 

ctttttccat tctggccata ggatgcttat agaattttat gagtcaccag atccagaaag 50220 

aagaaaaaga tttcctggga aaagtgttaa ttccaaatta agtatcaaga agactttacc 50280 

atcaatgttg atcttaagtg gtttgactgc aggcatgctt atgaccgatg ctggaaggaa 50340 

gctgtatgtg aacacctgga tatatggaac cctacttggc tgcctgtggg ttactattaa 50400 

agcatagaca agtagctgtc tccagacagt gggatgtgct acattgtcta tttttggcgg 50460 

ctgcacatga catcaaattg tttcctgaat ttattaagga gtgtaaataa agccttgttg 50520 

attgaagatt ggataataga atttgtgacg aaagctgata tgcaatggtc ttgggcaaac 50580 

atacctggtt gtacaacttt agcatcgggg ctgctggaag ggtaaaagct aaatggagtt 50640 

tctcctgctc tgtccatttc ctatgaacta atgacaactt gagaaggctg ggaggattgt 50700 

gtattttgca agtcagatgg ctgcattttt gagcattaat ttgcagcgta tttcactttt 50760 

tctgttattt tcaatttatt acaacttgac agctccaagc tcttattact aaagtattta 50820 

gtatcttgca gctagttaat atttcatctt ttgcttattt ctacaagtca gtgaaataaa 50880 

ttgtatttag gaagtgtcag gatgttcaaa ggaaagggta aaaagtgttc atggggaaaa 50940 

agctctgttt agcacatgat tttattgtat tgcgttatta gctgatttta ctcattttat 51000 

atttgcaaaa taaatttcta atatttattg aaattgctta atttgcacac cctgtacaca 51060 

cagaaaatgg tataaaatat gagaacgaag tttaaaattg tgactctgat tcattatagc 51120 

agaactttaa atttcccagc tttttgaaga tttaagctac gctattagta cttccctttg 51180 

tctgtgccat aagtgcttga aaacgttaag gttttctgtt ttgttttgtt tttttaatat 51240

caaaagagtc ggtgtgaacc ttggttggac cccaagttca caagattttt aaggtgatga 51300

gagcctgcag acattctgcc tagatttact agcgtgtgcc ttttgcctgc ttctctttga 51360 

tttcacagaa tattcattca gaagtcgcgt ttctgtagtg tggtggattc ccactgggct 51420 

ctggtccttc ccttggatcc cgtcagtggt gctgctcagc ggcttgcacg tagacttgct 51480 

aggaagaaat gcagagccag cctgtgctgc ccactttcag agttgaactc tttaagccct 51540

tgtgagtggg cttcaccagc tactgcagag gcattttgca tttgtctgtg tcaagaagtt 51600 

caccttctca agccagtgaa atacagactt aattcgtcat gactgaacga atttgtttat 51660

ttcccattag gtttagtgga gctacacatt aatatgtatc gccttagagc aagagctgtg 51720

ttccaggaac cagatcacga tttttagcca tggaacaata tatcccatgg gagaagacct 51780

ttcagtgtga actgttctat ttttgtgtta taatttaaac ttcgatttcc tcatagtcct 51840 

ttaagttgac atttctgctt actgctactg gatttttgct gcagaaatat atcagtggcc 51900 

cacattaaac ataccagttg gatcatgata agcaaaatga aagaaataat gattaaggga 51960 

aaattaagtg actgtgttac actgcttctc ccatgccaga gaataaactc tttcaagcat 52020 

catctttgaa gagtcgtgtg gtgtgaattg gtttgtgtac attagaatgt atgcacacat 52080 

ccatggacac tcaggatata gttggcctaa taatcggggc atgggtaaaa cttatgaaaa 52140 

tttcctcatg ctgaattgta attttctctt acctgtaaag taaaatttag atcaattcca 52200 

tgtctttgtt aagtacaggg atttaatata ttttgaatat aatgggtatg ttctaaattt 52260 

gaactttgag aggcaatact gttggaatta tgtggattct aactcatttt aacaaggtag 52320 

cctgacctgc ataagatcac ttgaatgtta ggtttcatag aactatacta atcttctcac 52380

aaaaggtcta taaaatacag tcgttgaaaa aaattttgta tcaaaatgtt tggaaaatta 52440 
gaagcttctc cttaacctgt attgatactg acttgaatta ttttctaaaa ttaagagccg 52500

tatacctacc tgtaagtctt ttcacatatc atttaaactt ttgtttgtat tattactgat 52560

ttacagctta gttattaatt tttctttata agaatgccgt cgatgtgcat gcttttatgt 52620

ttttcagaaa agggtgtgtt tggatgaaag taaaaaaaaa aataaaatct ttcactgtct 52680

ctaatggctg tgctgtttaa cattttttga ccctaaaatt caccaacagt ctcccagtac 52740

ataaaatagg cttaatgact ggccctgcat tcttcacaat atttttccct aagctttgag 52800

caaagtttta aaaaaataca ctaaaataat caaaactgtt aagcagtata ttagtttggt 52860

tatataaatt catctgcaat ttataagatg catggccgat gttaatttgc ttggcaattc 52920

tgtaatcatt aagtgatctc agtgaaacat gtcaaatgcc ttaaattaac taagttggtg 52980

aataaaagtg ccgatctggc taactcttac accatacata ctgatagttt ttcatatgtt 53040

tcatttccat gtgattttta aaatttagag tggcaacaat tttgcttaat atgggttaca 53100

taagctttat tttttccttt gttcataatt atattctttg aataggtctg tgtcaatcaa 53160

gtgatctaac tagactgatc atagatagaa ggaaataagg ccaagttcaa gaccagcctg 53220

ggcaacatat cgagaacctg tctacaaaaa aattaaaaaa aattagccag gcatggtggc 53280

gtacactgag tagtttgtcc cagctactcg ggagggtgag gtgggaggat cgcttcagcc 53340

caggaggttg agattgcagt gagccatgga cataccactg cactacagcc taggtaacag 53400

cacgagaccc caactcttag aaaatgaaaa ggaaatatag aaatataaaa tttgcttatt 53460

atagacacac agtaactccc agatatgtac cacaaaaaat gtgaaaagag agagaaatgt 53520

ctaccaaagc agtattttgt gtgtataatt gcaagcgcat agtaaaataa ttttaacctt 53580

aatttgtttt tagtagtgtt tagattgaag attgagtgaa atattttctt ggcagatatt 53640

ccgtatctgg tggaaagcta caatgcaatg tcgttgtagt tttgcatggc ttgctttata 53700

aacaagattt tttctccctc cttttgggcc agttttcatt acgagtaact cacacttttt 53760

gattaaagaa cttgaaatta cgttatcact tagtataatt gacattatat agagactatg 53820

taacatgcaa tcattagaat caaaattagt actttggtca aaatatttac aacattcaca 53880

tacttgtcaa atattcatgt aattaactga atttaaaacc ttcaactatt atgaagtgct 53940

cgtctgtaca atcgctaatt tactcagttt agagtagcta caactcttcg atactatcat 54000

caatatttga catcttttcc aatttgtgta tgaaaagtaa atctattcct gtagcaactg 54060

gggagtcata tatgaggtca aagacatata ccttgttatt ataatatgta tactataata 54120

atagctggtt atcctgagca ggggaaaagg ttatttttag gaaaaccact tcaaatagaa 54180

agctgaagta cttctaatat actgagggaa gtataatatg tggaacaaac tctcaacaaa 54240

atgtttattg atgttgatga aacagatcag tttttccatc cggattatta ttggttcatg 54300

attttatatg tgaatatgta agatatgttc tgcaatttta taaatgttca tgtctttttt 54360

taaaaaaggt gctattgaaa ttctgtgtct ccagcaggca agaatacttg actaactctt 54420

tttgtctctt tatggtattt tcagaataaa gtctgacttg tgtttttgag attattggtg 54480

cctcattaat tcagcaataa aggaaaatat gcatctcaaa aattggtgat aaaaagttat 54540

ttcttgtata tgtgataaag tttacatgtt gtgtatatat gttgtattgc caaatacggc 54600

tattaaatac tapgtcatat tttaaaggtt cagtttgtag tgatagtaaa caagcagtgc 54660

actaagcctc ttgcgggcat catctcatct cactgtcatc acaaacccca tgccacagcg 54720

tagcttgacc actaaaagta atgcatctgc aagcatactg ccaggttttg gatagtttgt 54780

accaacagtt accttatcaa ggtaaatccc agactctaaa agagttggtg ctgtgtcact 54840

acatgcataa ctttaaataa atttcctgcc gggcgcggtg gctcacgcct gtaatcccag 54900

cagtttggga ggccgaggca agtggatcac ttgaggtcag gagtttgaga ccagcctggc 54960

caacgtggtg aaaccctgtc tctactaaaa atacaaaaat tagccaggcg tgtggtggca 55020

ggcacctgta atcccagcta cttgggagga tgaggcagga gaatcatttg aatcctgcag 55080

gcggaggttg cagtgagcca agatggcgtc attgcactcc agcctgggcg acaagagcga 55140

gactccgtat taaaaaaaaa aaaaaaaaaa aaaaaaaatt cctctcctgt ttgagctttc 55200

ccttacctgt aaagagggga gaatatgtat ttacttcaaa gagttcaggg aaatgactct 55260

cactagtttg agattctagg tataaaaata cattcttata taattttaac accaatgtga 55320

gagattatta ttcttgctaa accaattcag ttttatttgc tgtctaaaat gtgtgaataa 55380

gtaattgtcc attattttct gaagtgtttt ggaactcaac acatgattgt gaggaggatt 55440

tgttgctaaa catctttctg gttattcaag ctcgtgtata ctgtgctctg ttgagacatg 55500

cagagttact ttctgtctgg gtcacaggtc agttcttgat agttttcgga caattaacca 55560

gttttcattt gcccatgacc acctttattc tttttcctca actgcaccca tcttttataa 55620

ggtctttcag tttattgcag agaagatggt ggagaaaagc cggaattccc acccaccgct 55680

gccatcccca tgttttatca ttggctagag tggaaaatag cagtaactac tgtgagagat 55740

catttgttta tataatggaa acaaagatga ggaaagaacc tggcttagat cagagaactg 55800

atgtatttag attctttttt tttttttttt taagacggag tgttgctctg ttgcccagac 55860

tggagtacag tggctcaatc tcggctcact gcaacctcca tttccctggt tcaagcaatt 55920

atcctgcctc agcctcccaa gtatttggga ttacaggcgt gttccaccac acctggctaa 55980

ttttttgtat ttttagtaga gacggggttt cgccatgttg gccaggctgg tctcgaaatc 56040

ctgacctcag atgatccacc cgccttggcc tcccaaagtg ctgggattac aggcgcgagc 56100
tallltltl ggcccaatgt atttggattc ttaaagaaca ctttcaaatt aaatatcagt 56160

tgaagagaac tagaactaaa gaatttctgt gtcaaactgt ttagcaaatg taagtagaag 56220 

ctgggagatg tgtcctggaa tgaatgaata catcagtaaa ataccatacg tatgttatS 56280 

tgttattgtt tcbttgcctt ggttgatttg gttttactgt gaaataatt? tcaatataga 56340

attgtgatcg ttggaatttg gtcatctact agaaaatgag aaagaagtta atagctatct 56400 

tccttaaaga tttctgaggt tgggattaag gtagtgttcc caafgtgttc taaaacggca 56460

gcgagagctg tgcactcact tcacaaattt gaattcctgc tctg?g?tag gcgctg 56516

<210> 2
<211> 23
<212> DNA
<213> Homo sapiens
<400> 2
ggtcgtccag cgcttggtag aag 23
<210> 3
<211> 5227
<212> DNA
<213> Homo sapiens
<220>
<221> polyA_signal
<222> 5180..5186
<223> AATAAA
<400> 3

ctgctgtccc tggtgctcca cacgtactcc atg cgc tac ctg ctg ccc agc gtc 54

Met Arg Tyr Leu Leu Pro Ser Val
1 5

gtg ctc ctg ggc acg gcg ccc acc tac gtg ttg gcc tgg ggg gtc tgg 102
Val Leu Leu Gly Thr Ala Pro Thr Tyr Val Leu Ala Trp Gly Val Trp
10 15 20

cgg ctg ctc tcc gcc ttc ctg ccc gcc cgc ttc tac caa gcg ctg gac 150
Arg Leu Leu Ser Ala Phe Leu Pro Ala Arg Phe Tyr Gln Ala Leu Asp
25 30 35 40

m ^^'^ ^^"^ ''^'^ ""^^ gtg ctc ttc ttc ttc gag 198
Asp Arg Leu Tyr Cys Val Tyr Gln Ser Met Val Leu Phe Phe Phe Glu
45 50 55

t^^ mu"" ^'"^ ''^^ 99^ ttg cca aaa aat 246
Asn Tyr Thr Gly Val Gln Ile Leu Leu Tyr Gly Asp Leu Pro Lys Asn
60 65 70

w= ^""^ ""^^ aca gtt gac tgg 294
Lys Glu Asn Ile Ile Tyr Leu Ala Asn His Gln Ser Thr Val Asp Trp
75 80 85

T^l fr^^ f?^ ^^'^ ^^"^ ^""^ cag aat gcg cta gga cat gtg 342
Ile Val Ala Asp Ile Leu Ala Ile Arg Gln Asn Ala Leu Gly His Val
90 95 100

l^'^ ?r^^ tta aaa tgg ctg cca ttg tat ggg tgt 390
Arg Tyr Val Leu Lys Glu Gly Leu Lys Trp Leu Pro Leu Tyr Gly Cys
105 110 115 120

m,^"^ ^^"^ gct cag cat gga gga atc tat gta aag cgc agt gcc aaa ttt 438
Tyr Phe Ala Gln His Gly Gly Ile Tyr Val Lys Arg Ser Ala Lys Phe
125 130 135

aac gag aaa gag atg cga aac aag ttg cag agc tac gtg gac gca gga 486
Asn Glu Lys Glu Met Arg Asn Lys Leu Gln Ser Tyr Val Asp Ala Gly
140 145 150

act cca atg tat ctt gtg att ttt cca gaa ggt aca agg tat aat cca 534
Thr Pro Met Tyr Leu Val Ile Phe Pro Glu Gly Thr Arg Tyr Asn Pro
155 160 165

ri? ^""^ ""^^ ^""^ 5='' cag gca ttt gct gcc caa cgt 582
Glu Gln Thr Lys Val Leu Ser Ala Ser Gln Ala Phe Ala Ala Gln Arg
170 175 180

r^Z t1 f?^ ""^^ ""^^ ata aag gca act 630
Gly Leu Ala Val Leu Lys His Val Leu Thr Pro Arg Ile Lys Ala Thr
185 190 195 200
cac gtt gct ttt gat tgc ... ... ... ... tta gat gca att tat gat 678
His Val Ala Phe Asp Cys Met Lys Asn Tyr Leu Asp Ala Ile Tyr Asp
205 210 215

gtt acg gtg gtt tat gaa ... ... ... ... gga ggg cag cga aga gag 726
Val Thr Val Val Tyr Glu Gly Lys Asp Asp Gly Gly Gln Arg Arg Glu
220 225 230

tca ccg acc atg acg gaa ttt ctc tgc aaa gaa tgt cca aaa att cat 774
Ser Pro Thr Met Thr Glu Phe Leu Cys Lys Glu Cys Pro Lys Ile His
235 240 245

att cac att gat cgt atc ... ... ... ... gtc cca gaa gaa caa gaa 822
Ile His Ile Asp Arg Ile Asp Lys Lys Asp Val Pro Glu Glu Gln Glu
250 255 260

cat atg aga aga tgg ctg cat gaa cgt ttc ... ... aaa gat aag atg 870
His Met Arg Arg Trp Leu His Glu Arg Phe Glu Ile Lys Asp Lys Met
265 270 275 280

ctt ata gaa ttt tat gag tca cca gat cca gaa aga aga aaa aga ttt 918
Leu Ile Glu Phe Tyr Glu Ser Pro Asp Pro Glu Arg Arg Lys Arg Phe
285 290 295

cct ggg aaa agt gtt aat tcc aaa tta agt atc aag aag act tta cca 966
Pro Gly Lys Ser Val Asn Ser Lys Leu Ser Ile Lys Lys Thr Leu Pro
300 305 310

tca atg ttg atc tta agt ggt ttg act gca ggc atg ctt atg acc gat 1014
Ser Met Leu Ile Leu Ser Gly Leu Thr Ala Gly Met Leu Met Thr Asp
315 320 325

gct gga agg aag ctg tat gtg aac acc tgg ata tat gga acc cta ctt 1062
Ala Gly Arg Lys Leu Tyr Val Asn Thr Trp Ile Tyr Gly Thr Leu Leu
330 335 340

ggc tgc ctg tgg gtt act att aaa gca tag acaagtagct gtctccagac 1112
Gly Cys Leu Trp Val Thr Ile Lys Ala
345 350



agtgggatgt gctacattgt ctatttttgg cggctgcaca tgacatcaaa ttgtttcctg 1172

aatttattaa ggagtgtaaa taaagccttg ttgattgaag attggataat agaatttgtg 1232

acgaaagctg atatgcaatg gtcttgggca aacatacctg gttgtacaac tttagcatcg 1292

gggctgctgg aagggtaaaa gctaaatgga gtttctcctg ctctgtccat ttcctatgaa 1352

ctaatgacaa cttgagaagg ctgggaggat tgtgtatttt gcaagtcaga tggctgcatt 1412

tttgagcatt aatttgcagc gtatttcact ttttctgtta ttttcaattt attacaactt 1472

gacagctcca agctcttatt actaaagtat ttagtatctt gcagctagtt aatatttcat 1532

cttttgctta tttctacaag tcagtgaaat aaattgtatt taggaagtgt caggatgttc 1592

aaaggaaagg gtaaaaagtg ttcatgggga aaaagctctg tttagcacat gattttattg 1652

tattgcgtta ttagctgatt ttactcattt tatatttgca aaataaattt ctaatattta 1712

ttgaaattgc ttaatttgca caccctgtac acacagaaaa tggtataaaa tatgagaacg 1772

aagtttaaaa ttgtgactct gattcattat agcagaactt taaatttccc agctttttga 1832

agatttaagc tacgctatta gtacttccct ttgtctgtgc cataagtgct tgaaaacgtt 1892

aaggttttct gttttgtttt gtttttttaa tatcaaaaga gtcggtgtga accttggttg 1952

gaccccaagt tcacaagatt tttaaggtga tgagagcctg cagacattct gcctagattt 2012

actagcgtgt gccttttgcc tgcttctctt tgatttcaca gaatattcat tcagaagtcg 2072

cgtttctgta gtgtggtgga ttcccactgg gctctggtcc ttcccttgga tcccgtcagt 2132

ggtgctgctc agcggcttgc acgtagactt gctaggaaga aatgcagagc cagcctgtgc 2192

tgcccacttt cagagttgaa ctctttaaag cccttgtgag tgggcttcac cagctactgc 2252

agaggcattt tgcatttgtc tgtgtcaaga agttcacctt ctcaagccag tgaaatacag 2312

acttaattcg tcatgactga acgaatttgt ttatttccca ttaggtttag tggagctaca 2372

cattaatatg tatcgcctta gagcaagagc tgtgttccag gaaccagatc acgattttta 2432

gccatggaac aatatatccc atgggagaag acctttcagt gtgaactgtt ctatttttgt 2492

gttataattt aaacttcgat ttcctcatag tcctttaagt tgacatttct gcttactgct 2552

actggatttt tgctgcagaa atatatcagt ggcccacatt aaacatacca gttggatcat 2612

gataagcaaa atgaaagaaa taatgattaa gggaaaatta agtgactgtg ttacactgct 2672

tctcccatgc cagagaataa actctttcaa gcatcatctt tgaagagtcg tgtggtgtga 2732

attggtttgt gtacattaga atgtatgcac acatccatgg acactcagga tatagttggc 2792

ctaataatcg gggcatgggt aaaacttatg aaaatttcct catgctgaat tgtaattttc 2852

tcttacctgt aaagtaaaat ttagatcaat tccatgtctt tgttaagtac agggatttaa 2912

tatattttga atataatggg tatgttctaa atttgaactt tgagaggcaa tactgttgga 2972
attatgtgga ttctaactca ttttaacaag gtagcctgac ctgcataaga tcacttgaat 3032

gttaggtttc atagaactat actaatcttc tcacaaaagg tctataaaat acagtcgttg 3092

aaaaaaattt tgtatcaaaa tgtttggaaa attagaagct tctccttaac ctgtattgat 3152

actgacttga attattttct aaaattaaga gccgtatacc tacctgtaag tcttttcaca 3212

tatcatttaa acttttgttt gtattattac tgatttacag cttagttatt aatttttctt 3272

tataagaatg ccgtcgatgt gcatgctttt atgtttttca gaaaagggtg tgtttggatg 3332

aaagtaaaaa aaaaaataaa atctttcact gtctctaatg gctgtgctgt ttaacatttt 3392

ttgaccctaa aattcaccaa cagtctccca gtacataaaa taggcttaat gactggccct 3452

gcattcttca caatattttt ccctaagctt tgagcaaagt tttaaaaaaa tacactaaaa 3512

taatcaaaac tgttaagcag tatattagtt tggttatata aattcatctg caatttataa 3572

gatgcatggc cgatgttaat ttgcttggca attctgtaat cattaagtga tctcagtgaa 3632

acatgtcaaa tgccttaaat taactaagtt ggtgaataaa agtgccgatc tggctaactc 3692

ttacaccata catactgata gtttttcata tgtttcattt ccatgtgatt tttaaaattt 3752

agagtggcaa caattttgct taatatgggt tacataagct ttattttttc ctttgttcat 3812

aattatattc tttgaatagg tctgtgtcaa tcaagtgatc taactagact gatcatagat 3872

agaaggaaat aaggccaagt tcaagaccag cctgggcaac atatcgagaa cctgtctaca 3932

aaaaaattaa aaaaaattag ccaggcatgg tggcgtacac tgagtagttt gtcccagcta 3992

ctcgggaggg tgaggtggga ggatcgcttc agcccaggag gttgagattg cagtgagcca 4052

tggacatacc actgcactac agcctaggta acagcacgag accccaactc ttagaaaatg 4112

aaaaggaaat atagaaatat aaaatttgct tattatagac acacagtaac tcccagatat 4172

gtaccacaaa aaatgtgaaa agagagagaa atgtctacca aagcagtatt ttgtgtgtat 4232

aattgcaagc gcatagtaaa ataattttaa ccttaatttg tttttagtag tgtttagatt 4292

gaagattgag tgaaatattt tcttggcaga tattccgtat ctggtggaaa gctacaatgc 4352

aatgtcgttg tagttttgca tggcttgctt tataaacaag attttttctc cctccttttg 4412

ggccagtttt cattacgagt aactcacact ttttgattaa agaacttgaa attacgttat 4472

cacttagtat aattgacatt atatagagac tatgtaacat gcaatcatta gaatcaaaat 4532

tagtactttg gtcaaaatat ttacaacatt cacatacttg tcaaatattc atgtaattaa 4592

ctgaatttaa aaccttcaac tattatgaag tgctcgtctg tacaatcgct aatttactca 4652

gtttagagta gctacaactc ttcgatacta tcatcaatat ttgacatctt ttccaatttg 4712

tgtatgaaaa gtaaatctat tcctgtagca actggggagt catatatgag gtcaaagaca 4772

tataccttgt tattataata tgtatactat aataatagct ggttatcctg agcaggggaa 4832

aaggttattt ttaggaaaac cacttcaaat agaaagctga agtacttcta atatactgag 4892

ggaagtataa tatgtggaac aaactctcaa caaaatgttt attgatgttg atgaaacaga 4952

tcagtttttc catccggatt attattggtt catgatttta tatgtgaata tgtaagatat 5012

gttctgcaat tttataaatg ttcatgtctt tttttaaaaa aggtgctatc gaaattctgt 5072

gtctccagca ggcaagaata cttgactaac tctttttgtc tctttatggt attttcagaa 5132

taaagtctga cttgtgtttt tgagattatt ggtgcctcat taattcagca ataaaggaaa 5192

atatgcattt caaaaanaaa aaaaaaaaaa aaaaa 5227
<210> 4
<211> 353
<212> PRT
<213> Homo sapiens
<220>
<221> HELIX
<222> 1..33
<223> Rao and Argos identification method, potential helix
<221> HELIX
<222> 4..20
<223> Klein, Kanehisa and DeLisi identification method, potential helix
<221> HELIX
<222> 4..24
<223> Eisenberg, Schwarz, Komarony, Wall identification method, potential helix
<221> MYRISTATE
<222> 12..16
<223> Prosite match
<221> HELIX
<222> 50..70
<223> Eisenberg, Schwarz, Komarony, Wall identification method, potential helix
<221> CARBOHYD
<222> 57..59

<223> Prosite match
<221> HELIX
<222> 76..96
<223> Eisenberg, Schwarz, Komarony, Wall identification method, potential helix
<221> PHOSPHORYLATION
<222> 78
<223> potential Tyrosine kinase site, Prosite match
<221> PHOSPHORYLATION
<222> 84
<223> potential caseine kinase II site, Prosite match
<221> SITE
<222> 94..115
<223> potential Leucine zipper site, Prosite match
<221> MYRISTATE
<222> 119..123
<223> potential site, Prosite match
<221> PHOSPHORYLATION
<222> 133
<223> potential protein kinase C, Prosite match
<221> PHOSPHORYLATION
<222> 147
<223> potential caseine kinase II site, Prosite match
<221> PHOSPHORYLATION
<222> 194
<223> potential protein kinase C, Prosite match
<221> PHOSPHORYLATION
<222> 215
<223> potential Tyrosine kinase site, Prosite match
<221> SULFATATION
<222> 221
<223> Prosite match
<221> PHOSPHORYLATION
<222> 233

<223> potential cAMP and cGMP dependant protein kinase site, Prosite match

<221> PHOSPHORYLATION
<222> 235
<223> potential caseine kinase II site, Prosite match
<221> PHOSPHORYLATION
<222> 306
<223> potential protein kinase C, Prosite match
<221> HELIX
<222> 310..330
<223> Eisenberg, Schwarz, Komarony, Wall identification method, potential helix
<221> MYRISTATE
<222> 319..323
<223> Prosite match
<221> MYRISTATE
<222> 323..327
<223> Prosite match
<221> AMIDATION
<222> 329
<223> Prosite match
<221> HELIX
<222> 333..353
<223> Eisenberg, Schwarz, Komarony, Wall identification method, potential helix
<221> MYRISTATE
<222> 341..345
<223> Prosite match
<221> PHOSPHORYLATION
<222> 350
<223> potential protein kinase C, Prosite match
<400> 4

Met Arg Tyr Leu Leu Pro Ser Val Val Leu Leu Gly Thr Ala Pro Thr
1 5 10 15

Tyr Val Leu Ala Trp Gly Val Trp Arg Leu Leu Ser Ala Phe Leu Pro
20 25 30

Ala Arg Phe Tyr Gln Ala Leu Asp Asp Arg Leu Tyr Cys Val Tyr Gln
35 40 45

Ser Met Val Leu Phe Phe Phe Glu Asn Tyr Thr Gly Val Gln Ile Leu
50 55 60

Leu Tyr Gly Asp Leu Pro Lys Asn Lys Glu Asn Ile Ile Tyr Leu Ala
65 70 75 80

Asn His Gln Ser Thr Val Asp Trp Ile Val Ala Asp Ile Leu Ala Ile
85 90 95

Arg Gln Asn Ala Leu Gly His Val Arg Tyr Val Leu Lys Glu Gly Leu
100 105 110

Lys Trp Leu Pro Leu Tyr Gly Cys Tyr Phe Ala Gln His Gly Gly Ile
115 120 125

Tyr Val Lys Arg Ser Ala Lys Phe Asn Glu Lys Glu Met Arg Asn Lys
130 135 140

Leu Gln Ser Tyr Val Asp Ala Gly Thr Pro Met Tyr Leu Val Ile Phe
145 150 155 160

Pro Glu Gly Thr Arg Tyr Asn Pro Glu Gln Thr Lys Val Leu Ser Ala
165 170 175

Ser Gln Ala Phe Ala Ala Gln Arg Gly Leu Ala Val Leu Lys His Val
180 185 190

Leu Thr Pro Arg Ile Lys Ala Thr His Val Ala Phe Asp Cys Met Lys
195 200 205

Asn Tyr Leu Asp Ala Ile Tyr Asp Val Thr Val Val Tyr Glu Gly Lys
210 215 220

Asp Asp Gly Gly Gln Arg Arg Glu Ser Pro Thr Met Thr Glu Phe Leu
225 230 235 240

Cys Lys Glu Cys Pro Lys Ile His Ile His Ile Asp Arg Ile Asp Lys
245 250 255

Lys Asp Val Pro Glu Glu Gln Glu His Met Arg Arg Trp Leu His Glu
260 265 270

Arg Phe Glu Ile Lys Asp Lys Met Leu Ile Glu Phe Tyr Glu Ser Pro
275 280 285

Asp Pro Glu Arg Arg Lys Arg Phe Pro Gly Lys Ser Val Asn Ser Lys
290 295 300

Leu Ser Ile Lys Lys Thr Leu Pro Ser Met Leu Ile Leu Ser Gly Leu
305 310 315 320

Thr Ala Gly Met Leu Met Thr Asp Ala Gly Arg Lys Leu Tyr Val Asn
325 330 335

Thr Trp Ile Tyr Gly Thr Leu Leu Gly Cys Leu Trp Val Thr Ile Lys
340 345 350

Ala

<210> 5 

<211> 364 

<212> PRT 

<213> Homo sapiens 

<400> 5 

Met Leu Leu Ser Leu Val Leu His Thr Tyr Ser Met Arg Tyr Leu Leu
1 5 10 15

Pro Ser Val Val Leu Leu Gly Thr Ala Pro Thr Tyr Val Leu Ala Trp
20 25 30

Gly Val Trp Arg Leu Leu Ser Ala Phe Leu Pro Ala Arg Phe Tyr Gln
35 40 45

Ala Leu Asp Asp Arg Leu Tyr Cys Val Tyr Gln Ser Met Val Leu Phe
50 55 60

Phe Phe Glu Asn Tyr Thr Gly Val Gln Ile Leu Leu Tyr Gly Asp Leu
65 70 75 80

Pro Lys Asn Lys Glu Asn Ile Ile Tyr Leu Ala Asn His Gln Ser Thr
85 90 95

Val Asp Trp Ile Val Ala Asp Ile Leu Ala Ile Arg Gln Asn Ala Leu
100 105 110

Gly His Val Arg Tyr Val Leu Lys Glu Gly Leu Lys Trp Leu Pro Leu
115 120 125

Tyr Gly Cys Tyr Phe Ala Gln His Gly Gly Ile Tyr Val Lys Arg Ser
130 135 140

Ala Lys Phe Asn Glu Lys Glu Met Arg Asn Lys Leu Gln Ser Tyr Val
145 150 155 160

Asp Ala Gly Thr Pro Met Tyr Leu Val Ile Phe Pro Glu Gly Thr Arg
165 170 175

Tyr Asn Pro Glu Gln Thr Lys Val Leu Ser Ala Ser Gln Ala Phe Ala
180 185 190

Ala Gln Arg Gly Leu Ala Val Leu Lys His Val Leu Thr Pro Arg Ile
195 200 205

Lys Ala Thr His Val Ala Phe Asp Cys Met Lys Asn Tyr Leu Asp Ala
210 215 220

Ile Tyr Asp Val Thr Val Val Tyr Glu Gly Lys Asp Asp Gly Gly Gln
225 230 235 240

Arg Arg Glu Ser Pro Thr Met Thr Glu Phe Leu Cys Lys Glu Cys Pro
245 250 255

Lys Ile His Ile His Ile Asp Arg Ile Asp Lys Lys Asp Val Pro Glu
260 265 270

Glu Gln Glu His Met Arg Arg Trp Leu His Glu Arg Phe Glu Ile Lys
275 280 285

Asp Lys Met Leu Ile Glu Phe Tyr Glu Ser Pro Asp Pro Glu Arg Arg
290 295 300

Lys Arg Phe Pro Gly Lys Ser Val Asn Ser Lys Leu Ser Ile Lys Lys
305 310 315 320

Thr Leu Pro Ser Met Leu Ile Leu Ser Gly Leu Thr Ala Gly Met Leu
325 330 335

Met Thr Asp Ala Gly Arg Lys Leu Tyr Val Asn Thr Trp Ile Tyr Gly
340 345 350

Thr Leu Leu Gly Cys Leu Trp Val Thr Ile Lys Ala
355 360

<210> 6
<211> 26
<212> DNA
<213> Homo sapiens
<220>
<221> misc_binding
<222> 1..26
<223> primer oligonucleotide GC1.5p.1
<400> 6
ctgtccctgg tgctccacac gtactc 26
<210> 7
<211> 26
<212> DNA
<213> Homo sapiens
<220>
<221> misc_binding
<222> 1..26
<223> primer oligonucleotide GC1.5p.2
<400> 7
tggtgctcca cacgtactcc atgcgc 26
<210> 8
<211> 27
<212> DNA
<213> Homo sapiens
<220>
<221> misc_binding
<222> 1..27
<223> primer oligonucleotide pgl5RACE196
<400> 8
caatatctgg accccggtgt aattctc 27
<210> 9
<211> 34
<212> DNA
<213> Homo sapiens
<220>
<221> misc_binding
<222> 1..34
<223> primer oligonucleotide GC1.3p
<400> 9
cttgcctgct ggagacacag aatttcgata gcac 34
<210> 10
<211> 24
<212> DNA
<213> Homo sapiens
<220>
<221> misc_binding
<222> 1..24
<223> primer oligonucleotide PGRT32
<400> 10
tttttttttt tttttttttg aaat 24

<210> 11 

<211> 6 

<212> PRT 

<213> Homo sapiens 

<220> 

<221> SITE 
<222> 160..165
<223> box2 from SEQID4, present in AF003136, P33333, P26647, U89336, U56417, AB005623.
<400> 11 

Phe Pro Glu Gly Thr Arg 

1 5 

<210> 12 

<211> 6 

<212> PRT 

<213> Homo sapiens 

<220> 

<221> SITE 

<222> 129 . .134 

<223> box2 from Z72511 

<400> 12 

Phe Pro Glu Gly Thr Asp 
1 5 
<210> 13 
<211> 6 
<212> PRT 

<213> Homo sapiens
<220> 

<221> SITE 
<222> 223. .228 

<223> box2 from P38226, Z49770 
<400> 13

Phe Pro Glu Gly Thr Asn 
1 5 
<210> 14 
<211> 6 
<212> PRT 

<213> Homo sapiens 
<220> 

<221> SITE 
<222> 90. .95 

<223> box2 from Z49860 and Z29518 
<400> 14 

Phe Val Glu Gly Thr Arg 
1 5

<210> 15 
<211> 9 
<212> PRT 

<213> Homo sapiens 
<220> 

<221> SITE 
<222> 211. .219 

<223> box3 from SEQID4, present in AF003135 
<400> 15 

Leu Asp Ala Ile Tyr Asp Val Thr Val
1 5 
<210> 16 
<211> 9 
<212> PRT 

<213> Homo sapiens 
<220> 

<221> SITE 

<222> 204. .212 

<223> box3 from Z72511 

<400> 16 

Val Glu Tyr Ile Tyr Asp Ile Thr Ile

1 5 

<210> 17 

<211> 9 

<212> PRT 

<213> Homo sapiens 

<220>

<221> SITE

<222> 271..279

<223> box3 from P38226

<400> 17

Ile Glu Ser Leu Tyr Asp Ile Thr Ile
1 5 
<210> 18 
<211> 9 
<212> PRT 

<213> Homo sapiens 
<220> 

<221> SITE 

<222> 265. .273 

<223> box3 from Z49770 

<400> 18 

Leu Asp Ala Ile Tyr Asp Val Thr Ile
1 5 
<210> 19 
<211> 9 
<212> PRT 






<213> Homo sapiens 
<220> 

<221> SITE 

<222> 138. .146 

<223> box3 from Z49860

<400> 19

Val Pro Ala Ile Tyr Asp Met Thr Val

1 5

<210> 20 

<211> 9 

<212> PRT

<213> Homo sapiens 

<220> 

<221> SITE 

<222> 218..226

<223> box3 from Z29518

<400> 20

Val Pro Ala Ile Tyr Asp Thr Thr Val

1 5 

<210> 21 

<211> 47 

<212> DNA 

<213> Homo Sapiens 

<220> 

<221> allele 
<222> 1..47

<223> polymorphic fragment 99-123
<221> allele
<222> 24

<223> polymorphic base C
<221> primer_bind
<222> 1..23

<223> potential microsequencing oligo 99-123.mis1
<221> primer_bind
<222> 25..47

<223> complement potential microsequencing oligo 99-123.mis2
<400> 21

tttctcatcc tcacacctca ctgcgcccct cctgaaccca ctccttt 47

<210> 22

<211> 47 

<212> DNA 

<213> Homo Sapiens 

<220> 

<221> allele 
<222> 1..47

<223> polymorphic fragment 4-26
<221> allele
<222> 24

<223> polymorphic base G
<221> primer_bind
<222> 1..23

<223> potential microsequencing oligo 4-26.mis1
<221> primer_bind
<222> 25..47

<223> complement potential microsequencing oligo 4-26.mis2
<400> 22 

ccctgtnaga cacgtcctgt atcgttgttg agatgggaaa gtgcatc 47 

<210> 23 

<211> 47 

<212> DNA 

<213> Homo Sapiens 
<220>

<221> allele
<222> 1..47

<223> polymorphic fragment 4-14
<221> allele
<222> 24

<223> polymorphic base T
<221> primer_bind
<222> 1..23

<223> potential microsequencing oligo 4-14.mis1
<221> primer_bind
<222> 25..47

<223> complement potential microsequencing oligo 4-14.mis2
<400> 23

gcagggagca gaccagacat gatttgttct agtctagctg attcata 47

<210> 24
<211> 47

<212> DNA

<213> Homo Sapiens

<220>

<221> allele
<222> 1..47

<223> polymorphic fragment 4-77, extracted from SEQ ID1 12057 12103
<221> allele
<222> 24

<223> polymorphic base C
<221> primer_bind
<222> 1..23

<223> potential microsequencing oligo 4-77.mis1
<221> primer_bind
<222> 25..47

<223> complement potential microsequencing oligo 4-77.mis2
<400> 24

gctgttcaga ctaaacttgg agactacagt cagtcagaga acttgct 47 
<210> 25 
<211> 47 
<212> DNA 

<213> Homo Sapiens 
<220> 

<221> allele 
<222> 1..47

<223> polymorphic fragment 99-217, extracted from SEQ ID1 34469 34515
<221> allele
<222> 24

<223> polymorphic base C
<221> primer_bind
<222> 1..23

<223> potential microsequencing oligo 99-217.mis1
<221> primer_bind
<222> 25..47

<223> complement potential microsequencing oligo 99-217.mis2
<400> 25

atatagttca cgttatgttc atacttaatt gttgcatttt gtttgcc 47

<210> 26

<211> 47 

<212> DNA 

<213> Homo Sapiens 

<220> 

<221> allele 
<222> 1..47

<223> polymorphic fragment 4-67, extracted from SEQ ID1 51612 51658
<221> allele
<222> 24

<223> polymorphic base C
<221> primer_bind
<222> 1..23

<223> potential microsequencing oligo 4-67.mis1
<221> primer_bind
<222> 25..47

<223> complement potential microsequencing oligo 4-67.mis2
<400> 26

gccagtgaaa tacagactta attcgtcatg actgaacgaa tttgttt 47

<210> 27 

<211> 47 

<212> DNA 

<213> Homo Sapiens 

<220> 

<221> allele 
<222> 1..47

<223> polymorphic fragment 99-213
<221> allele
<222> 24

<223> polymorphic base T
<221> primer_bind
<222> 1..23

<223> potential microsequencing oligo 99-213.mis1
<221> primer_bind
<222> 25..47

<223> complement potential microsequencing oligo 99-213.mis2
<400> 27

ccttagcatt caagcccctg agctctggtg ttgtccaccc ctggggg 47

<210> 28

<211> 47 

<212> DNA 

<213> Homo Sapiens 

<220> 

<221> allele 
<222> 1. .47 

<223> polymorphic fragment 99-221 
<221> allele 
<222> 24 

<223> polymorphic base A 
<221> primer_bind 
<222> 1..23

<223> potential microsequencing oligo 99-221.mis1
<221> primer_bind
<222> 25..47

<223> complement potential microsequencing oligo 99-221.mis2
<400> 28

agcttgagaa accagaaaag ccaaaaggag gctcctacca catgggt 47

<210> 29 

<211> 47 

<212> DNA 

<213> Homo Sapiens 

<220> 

<221> allele 
<222> 1. .47 

<223> polymorphic fragment 99-135 
<221> allele 
<222> 24 

<223> polymorphic base A 
<221> primer_bind 
<222> 1..23

<223> potential microsequencing oligo 99-135.mis1
<221> primer_bind
<222> 25..47

<223> complement potential microsequencing oligo 99-135.mis2
<400> 29

agtcactata tctatgttta atgaagatag aaagagatgc agaaatg 47
<210> 30

<211> 47 
<212> DNA 

<213> Homo Sapiens 
<220> 

<221> allele 
<222> 1..47

<223> polymorphic fragment 99-123, variant version of SEQ ID21
<221> allele
<222> 24

<223> base T ; C in SEQ ID21
<221> primer_bind
<222> 1..23

<223> potential microsequencing oligo 99-123.mis1
<221> primer_bind
<222> 25..47

<223> complement potential microsequencing oligo 99-123.mis2
<400> 30

tttctcatcc tcacacctca ctgtgcccct cctgaaccca ctccttt 47
<210> 31 
<211> 47 
<212> DNA 

<213> Homo Sapiens 
<220> 

<221> allele 
<222> 1..47

<223> polymorphic fragment 4-26, variant version of SEQ ID22
<221> allele
<222> 24

<223> base A ; G in SEQ ID22
<221> primer_bind
<222> 1..23

<223> potential microsequencing oligo 4-26.mis1
<221> primer_bind
<222> 25..47

<223> complement potential microsequencing oligo 4-26.mis2
<400> 31

ccctgtnaga cacgtcctgt atcattgttg agatgggaaa gtgcatc 47
<210> 32
<211> 47 
<212> DNA 

<213> Homo Sapiens 
<220> 

<221> allele 
<222> 1. .47 

<223> polymorphic fragment 4-14, variant version of SEQ ID23 
<221> allele 
<222> 24 

<223> base C ; T in SEQ ID23 
<221> primer_bind 
<222> 1. .23 

<223> potential microsequencing oligo 4-14.mis1
<221> primer_bind
<222> 25..47
<223> complement potential microsequencing oligo 4-14.mis2
<400> 32 

gcagggagca gaccagacat gatctgttct agtctagctg attcata 47 
<210> 33 
<211> 47 
<212> DNA 

<213> Homo Sapiens 
<220> 

<221> allele 
<222> 1..47

<223> polymorphic fragment 4-77, variant version of SEQ ID24
<221> allele
<222> 24

<223> base G ; C in SEQ ID24
<221> primer_bind
<222> 1..23

<223> potential microsequencing oligo 4-77.mis1
<221> primer_bind
<222> 25..47

<223> complement potential microsequencing oligo 4-77.mis2
<400> 33 

gctgttcaga ctaaacttgg agagtacagt cagtcagaga acttgct 47 

<210> 34 

<211> 47 

<212> DNA 

<213> Homo Sapiens 

<220> 

<221> allele 
<222> 1..47

<223> polymorphic fragment 99-217, variant version of SEQ ID25
<221> allele
<222> 24

<223> base T ; C in SEQ ID25
<221> primer_bind
<222> 1..23

<223> potential microsequencing oligo 99-217.mis1
<221> primer_bind
<222> 25..47

<223> complement potential microsequencing oligo 99-217.mis2
<400> 34 

atatagttca cgttatgttc atatttaatt gttgcatttt gtttgcc 47 

<210> 35 

<211> 47 

<212> DNA 

<213> Homo Sapiens 

<220> 

<221> allele 
<222> 1..47

<223> polymorphic fragment 4-67, variant version of SEQ ID26
<221> allele
<222> 24

<223> base T ; C in SEQ ID26
<221> primer_bind
<222> 1..23

<223> potential microsequencing oligo 4-67.mis1
<221> primer_bind
<222> 25..47

<223> complement potential microsequencing oligo 4-67.mis2
<400> 35

gccagtgaaa tacagactta atttgtcatg actgaacgaa tttgttt 47
<210> 36
<211> 47
<212> DNA 

<213> Homo Sapiens 
<220> 

<221> allele 
<222> 1..47

<223> polymorphic fragment 99-213, variant version of SEQ ID27
<221> allele
<222> 24

<223> base C ; T in SEQ ID27
<221> primer_bind
<222> 1..23

<223> potential microsequencing oligo 99-213.mis1
<221> primer_bind
<222> 25..47

<223> complement potential microsequencing oligo 99-213.mis2
<400> 36

ccttagcatt caagcccctg agccctggtg ttgtccaccc ctggggg 47

<210> 37

<211> 47 
<212> DNA 

<213> Homo Sapiens 

<220>

<221> allele
<222> 1..47

<223> polymorphic fragment 99-221, variant version of SEQ ID28
<221> allele

<222> 24

<223> base C ; A in SEQ ID28
<221> primer_bind
<222> 1..23

<223> potential microsequencing oligo 99-221.mis1
<221> primer_bind
<222> 25..47

<223> complement potential microsequencing oligo 99-221.mis2
<400> 37

agcttgagaa accagaaaag ccacaaggag gctcctacca catgggt 47

<210> 38

<211> 47 
<212> DNA 

<213> Homo Sapiens 
<220> 

<221> allele 
<222> 1..47

<223> polymorphic fragment 99-135, variant version of SEQ ID29
<221> allele
<222> 24

<223> base G ; A in SEQ ID29
<221> primer_bind
<222> 1..23

<223> potential microsequencing oligo 99-135.mis1
<221> primer_bind
<222> 25..47

<223> complement potential microsequencing oligo 99-135.mis2
<400> 38

agtcactata tctatgttta atggagatag aaagagatgc agaaatg 47

<210> 39

<211> 18 
<212> DNA 

<213> Homo Sapiens 
<220> 
<221> primer_bind
<222> 1..18

<223> upstream amplification primer 99-123-PU
<400> 39

aaagccagga ctagaagg 18 
<210> 40 
<211> 18 
<212> DNA 

<213> Homo Sapiens 
<220> 

<221> primer_bind 
<222> 1. .18 

<223> upstream amplification primer 4-26-PU 
<400> 40 

tacagccctg taagacac 18 
<210> 41 
<211> 18 
<212> DNA 

<213> Homo Sapiens 
<220> 

<221> primer_bind 

<222> 1..18

<223> upstream amplification primer 4-14-PU 
<400> 41 

tctaacctct catccaac 18 
<210> 42 
<211> 18 
<212> DNA 

<213> Homo Sapiens 
<220> 

<221> primer_bind 
<222> 1..18

<223> upstream amplification primer 4-77-PU, extracted from SEQ ID1 11930 11947

<400> 42 

tgttgattta caggcggc 18 
<210> 43 
<211> 19 
<212> DNA 

<213> Homo Sapiens 
<220> 

<221> primer_bind 
<222> 1..19

<223> upstream amplification primer 99-217-PU, extracted from SEQ ID1 34216 34234

<400> 43 

ggtgggaatt tactatatg 19 

<210> 44 

<211> 18 

<212> DNA 

<213> Homo Sapiens 

<220> 

<221> primer_bind 
<222> 1..18

<223> upstream amplification primer 4-67-PU, extracted from SEQ ID1 51596 51613

<400> 44 

aagttcacct tctcaagc 18 
<210> 45 
<211> 20 
<212> DNA 
<213> Homo Sapiens
<220>

<221> primer_bind
<222> 1..20

<223> upstream amplification primer 99-213-PU
<400> 45

atactggcag cgtgtgcttc 20
<210> 46 
<211> 19 
<212> DNA 

<213> Homo Sapiens 
<220> 

<221> primer_bind 
<222> 1. .19 

<223> upstream amplification primer 99-221-PU 
<400> 46

ccctttttct tcactgttc 19
<210> 47 
<211> 18 
<212> DNA 

<213> Homo Sapiens 
<220>

<221> primer_bind
<222> 1..18

<223> upstream amplification primer 99-135-PU
<400> 47

tggaagttgt tattgccc 18
<210> 48 
<211> 18 
<212> DNA 

<213> Homo Sapiens 
<220> 

<221> primer_bind
<222> 1..18

<223> downstream amplification primer 99-123-RP
<400> 48

tattcagaaa ggagtggg 18
<210> 49 
<211> 18 
<212> DNA 

<213> Homo Sapiens 
<220> 

<221> primer_bind 
<222> 1..18

<223> downstream amplification primer 4-26-RP

<400> 49

tgaggactgc taggaaag 18

<210> 50 

<211> 20 

<212> DNA 

<213> Homo Sapiens 
<220> 

<221> primer_bind
<222> 1..20

<223> downstream amplification primer 4-14-RP
<400> 50

gactgtatcc tttgatgcac 20
<210> 51 
<211> 20 
<212> DNA 

<213> Homo Sapiens 






<220> 

<221> primer_bind 
<222> 1..20

<223> downstream amplification primer 4-77-RP, extracted from SEQ ID1 12339 12358 complement
<400> 51 

ggaaaggtac tcattcatag 20 

<210> 52

<211> 21 

<212> DNA 

<213> Homo Sapiens 

<220> 

<221> primer_bind 
<222> 1..21

<223> downstream amplification primer 99-217-RP, extracted from SEQ ID1 34625 34645 complement
<400> 52

gtttattttg tgtgagcttt g 21 

<210> 53 

<211> 20 

<212> DNA 

<213> Homo Sapiens 

<220> 

<221> primer_bind 
<222> 1..20 

<223> downstream amplification primer 4-67-RP, extracted from SEQ ID1 51996 52015 complement
<400> 53 

tgaaagagtt tattctctgg 20 
<210> 54 
<211> 21 
<212> DNA 

<213> Homo Sapiens 
<220> 

<221> primer_bind
<222> 1..21

<223> downstream amplification primer 99-213-RP
<400> 54

ttattgcccc ac^itgcttga g 21 

<210> 55 

<211> 19 

<212> DNA 

<213> Homo Sapiens 

<220> 

<221> primer_bind
<222> 1..19

<223> downstream amplification primer 99-221-RP
<400> 55 

tcattcgtct ggctaggtc 19 

<210> 56 

<211> 18 

<212> DNA 

<213> Homo Sapiens 

<220> 

<221> primer_bind 
<222> 1..18

<223> downstream amplification primer 99-135-RP
<400> 56

aaacacctcc cattgtgc 18
<210> 57

<211> 47 

<212> DNA 

<213> Homo Sapiens

<220> 

<221> allele 
<222> 1..47

<223> polymorphic fragment 99-1482
<221> allele
<222> 24

<223> polymorphic base C
<221> primer_bind
<222> 1..23

<223> potential microsequencing oligo 99-1482.mis1

<221> primer_bind

<222> 25..47

<223> complement potential microsequencing oligo 99-1482.mis2
<400> 57

agtgaagtct gagggggaaa aatcaaccct atagagggaa ggatctg 47
<210> 58 
<211> 47 
<212> DNA
<213> Homo Sapiens
<220>

<221> allele
<222> 1..47

<223> polymorphic fragment 4-73, extracted from SEQ ID1 13657 13703
<221> allele
<222> 24

<223> polymorphic base C in PG1 (13680) SEQ ID1
<221> primer_bind
<222> 1..23

<223> potential microsequencing oligo 4-73.mis1
<221> primer_bind
<222> 25..47

<223> complement potential microsequencing oligo 4-73.mis2
<400> 58 

gttttcctta tgatgttaca tggcttattt ttaaaggtaa tgaaaac 47 

<210> 59 

<211> 47

<212> DNA

<213> Homo Sapiens

<220>

<221> allele
<222> 1..47

<223> polymorphic fragment 4-65, extracted from SEQ ID1 51448 51494
<221> allele
<222> 24

<223> polymorphic base T in PG1 (51471) SEQ ID1
<221> primer_bind
<222> 1..23

<223> potential microsequencing oligo 4-65.mis1
<221> primer_bind
<222> 25..47

<223> complement potential microsequencing oligo 4-65.mis2
<400> 59

ggtgctgctc agcggcttgc acgtagactt gctaggaaga aatgcag 47

<210> 60

<211> 47 

<212> DNA 

<213> Homo Sapiens 
<220>

<221> allele
<222> 1..47

<223> polymorphic fragment 99-1482, variant version of SEQ ID57
<221> allele
<222> 24

<223> base A ; C in SEQ ID57
<221> primer_bind
<222> 1..23

<223> potential microsequencing oligo 99-1482.mis1
<221> primer_bind
<222> 25..47

<223> complement potential microsequencing oligo 99-1482.mis2
<400> 60

agtgaagtct gagggggaaa aataaaccct atagagggaa ggatctg 47
<210> 61
<211> 47 
<212> DNA 

<213> Homo Sapiens 
<220> 

<221> allele 

<222> 1..47

<223> polymorphic fragment 4-73, variant version of SEQ ID58
<221> allele
<222> 24

<223> base G ; C in SEQ ID58
<221> primer_bind
<222> 1..23

<223> potential microsequencing oligo 4-73.mis1
<221> primer_bind
<222> 25..47

<223> complement potential microsequencing oligo 4-73.mis2
<400> 61 

gttttcctta tgatgttaca tgggttattt ttaaaggtaa tgaaaac 47 
<210> 62 
<211> 47 
<212> DNA 

<213> Homo Sapiens 
<220> 

<221> allele 
<222> 1. .47 

<223> polymorphic fragment 4-65, variant version of SEQ ID59 
<221> allele 
<222> 24 

<223> base C ; T in SEQ ID59
<221> primer_bind
<222> 1..23

<223> potential microsequencing oligo 4-65.mis1
<221> primer_bind
<222> 25..47

<223> complement potential microsequencing oligo 4-65.mis2
<400> 62

ggtgctgctc agcggcttgc acgcagactt gctaggaaga aatgcag 47

<210> 63

<211> 21 

<212> DNA 

<213> Homo Sapiens 

<220> 

<221> primer_bind 
<222> 1. .21 

<223> upstream amplification primer 99-1482-PU 
<400> 63

atcaaatcag tgaagtctga g 21

<210> 64

<211> 18 

<212> DNA 

<213> Homo Sapiens 

<220> 

<221> primer_bind 
<222> 1..18

<223> upstream amplification primer 4-73-PU, extracted from SEQ ID1 13547 13564
<400> 64

atcgctggaa cattctgg 18

<210> 65 

<211> 20 

<212> DNA 

<213> Homo Sapiens

<220>

<221> primer_bind
<222> 1..20

<223> upstream amplification primer 4-65-PU, extracted from SEQ ID1 51149 51168
<400> 65

gatttaagct acgctattag 20

<210> 66
<211> 20 
<212> DNA 

<213> Homo Sapiens 
<220> 

<221> primer_bind
<222> 1..20

<223> downstream amplification primer 99-1482-RP
<400> 66

acaaatctat ataaggctgg 20

<210> 67

<211> 20 

<212> DNA 

<213> Homo Sapiens 

<220> 

<221> primer_bind
<222> 1..20

<223> downstream amplification primer 4-73-RP, extracted from SEQ ID1 13962 13981 complement

<400> 67

ctcttggtta aacagcagtg 20

<210> 68

<211> 18 

<212> DNA 

<213> Homo Sapiens 

<220> 

<221> primer_bind
<222> 1..18

<223> downstream amplification primer 4-65-RP, extracted from SEQ ID1 51482 51499 complement

<400> 68

tggctctgca tttcttcc 18

<210> 69 

<211> 5226 

<212> DNA 

<213> Homo sapiens 

<400> 69 
ctgctgtccc tggtgctcca cacgtactcc atg cgc tac ctg ctg ccc agc gtc 54

Met Arg Tyr Leu Leu Pro Ser Val
1 5

gtg ctc ctg ggc acg gcg ccc acc tac gtg ttg gcc tgg ggg gtc tgg 102
Val Leu Leu Gly Thr Ala Pro Thr Tyr Val Leu Ala Trp Gly Val Trp

10 15 20

cgg ctg ctc tcc gcc ttc ctg ccc gcc cgc ttc tac caa gcg ctg gac 150
Arg Leu Leu Ser Ala Phe Leu Pro Ala Arg Phe Tyr Gln Ala Leu Asp
25 30 35 40

gac cgg ctc tac tgc gtc tac cag agc atg gtg ctc ttc ttc ttc gag 198
Asp Arg Leu Tyr Cys Val Tyr Gln Ser Met Val Leu Phe Phe Phe Glu

45 50 55

aat tac acc ggg gtc cag ata ttg cta tat gga gat ttg cca aaa aat 246
Asn Tyr Thr Gly Val Gln Ile Leu Leu Tyr Gly Asp Leu Pro Lys Asn

60 65 70

aaa gaa aat ata ata tat tta gca aat cat caa agc aca gtt gac tgg 294
Lys Glu Asn Ile Ile Tyr Leu Ala Asn His Gln Ser Thr Val Asp Trp

75 80 85

att gtt gct gac atc ttg gcc atc agg cag aat gcg cta gga cat gtg 342
Ile Val Ala Asp Ile Leu Ala Ile Arg Gln Asn Ala Leu Gly His Val

90 95 100

cgc tac gtg ctg aaa gaa ggg tta aaa tgg ctg cca ttg tat ggg tgt 390
Arg Tyr Val Leu Lys Glu Gly Leu Lys Trp Leu Pro Leu Tyr Gly Cys
105 110 115 120

tac ttt gct cag cat gga gga atc tat gta aag cgc agt gcc aaa ttt 438

Tyr Phe Ala Gln His Gly Gly Ile Tyr Val Lys Arg Ser Ala Lys Phe

125 130 135

aac gag aaa gag atg cga aac aag ttg cag agc tac gtg gac gca gga 486
Asn Glu Lys Glu Met Arg Asn Lys Leu Gln Ser Tyr Val Asp Ala Gly

140 145 150

act cca atg tat ctt gtg att ttt cca gaa ggt aca agg tat aat cca 534
Thr Pro Met Tyr Leu Val Ile Phe Pro Glu Gly Thr Arg Tyr Asn Pro

155 160 165

gag caa aca aaa gtc ctt tca gct agt cag gca ttt gct gcc caa cgt 582
Glu Gln Thr Lys Val Leu Ser Ala Ser Gln Ala Phe Ala Ala Gln Arg

170 175 180

ggc ctt gca gta tta aaa cat gtg cta aca cca cga ata aag gca act 630

Gly Leu Ala Val Leu Lys His Val Leu Thr Pro Arg Ile Lys Ala Thr
185 190 195 200

cac gtt gct ttt gat tgc atg aag aat tat tta gat gca att tat gat 678
His Val Ala Phe Asp Cys Met Lys Asn Tyr Leu Asp Ala Ile Tyr Asp

205 210 215

gtt acg gtg gtt tat gaa ggg aaa gac gat gga ggg tag cgaagagagt 727
Val Thr Val Val Tyr Glu Gly Lys Asp Asp Gly Gly *

220 225
caccgaccat gacggaattt ctctgcaaag aatgtccaaa aattcatatt cacattgatc 787
gtatcgacaa aaaagatgtc ccagaagaac aagaacatat gagaagatgg ctgcatgaac 847
gtttcgaaat caaagataag atgcttatag aattttatga gtcaccagat ccagaaagaa 907
gaaaaagatt tcctgggaaa agtgttaatt ccaaattaag tatcaagaag actttaccat 967
caatgttgat cttaagtggt ttgactgcag gcatgcttat gaccgatgct ggaaggaagc 1027
tgtatgtgaa cacctggata tatggaaccc tacttggctg cctgtgggtt actattaaag 1087
catagacaag tagctgtctc cagacagtgg gatgtgctac attgtctatt tttggcggct 1147
gcacatgaca tcaaattgtt tcctgaattt attaaggagt gtaaataaag ccttgttgat 1207
tgaagattgg ataatagaat ttgtgacgaa agctgatatg caatggtctt gggcaaacat 1267
acctggttgt acaactttag catcggggct gctggaaggg taaaagctaa atggagtttc 1327
tcctgctctg tccatttcct atgaactaat gacaacttga gaaggctggg aggattgtgt 1387
attttgcaag tcagatggct gcatttttga gcattaattt gcagcgtatt tcactttttc 1447
tgttattttc aatttattac aacttgacag ctccaagctc ttattactaa agtatttagt 1507
atcttgcagc tagttaatat ttcatctttt gcttatttct acaagtcagt gaaataaatt 1567
gtatttagga agjbgtcagga tgttcaaagg aaagggtaaa aagtgttcat ggggaaaaag 1627
ctctgtttag cafcatgattt tattgtattg cgttattagc tgattttact cattttatat 1687
ttgcaaaata aatttctaat atttattgaa attgcttaat ttgcacaccc tgtacacaca 1747
gaaaatggta taaaatatga gaacgaagtt taaaattgtg actctgattc attatagcag 1807
aactttaaat ttcccagctt tttgaagatt taagctacgc tattagtact tccctttgtc 1867
tgtgccataa gt^cttgaaa acgttaaggt tttctgtttt gttttgtttt tttaatatca 1927
aaagagtcgg tgtgaacctt ggttggaccc caagttcaca agatttttaa ggtgatgaga 1987
gcctgcagac attctgccta gatttactag cgtgtgcctt ttgcctgctt ctctttgatt 2047
tcacagaata ttcattcaga agtcgcgttt ctgtagtgtg gtggattccc actgggctct 2107
ggtccttccc ttggatcccg tcagtggtgc tgctcagcgg cttgcacgta gacttgctag 2167
gaagaaatgc agagccagcc tgtgctgccc actttcagag ttgaactctt taagcccttg 2227
tgagtgggct tcaccagcta ctgcagaggc attttgcatt tgtctgtgtc aagaagttca 2287
ccttctcaag ccagtgaaat acagacttaa ttcgtcatga ctgaacgaat ttgtttattt 2347
cccattaggt ttagtggagc tacacattaa tatgtatcgc cttagagcaa gagctgtgtt 2407
ccaggaacca gatcacgatt tttagccatg gaacaatata tcccatggga gaagaccttt 2467
cagtgtgaac tgttctattt ttgtgttata atttaaactt cgatttcctc atagtccttt 2527
aagttgacat ttctgcttac tgctactgga tttttgctgc agaaatatat cagtggccca 2587
cattaaacat accagttgga tcatgataag caaaatgaaa gaaataatga ttaagggaaa 2647
attaagtgac tgtgttacac tgcttctccc atgccagaga ataaactctt tcaagcatca 2707
tctttgaaga gtcgtgtggt gtgaattggt ttgtgtacat tagaatgtat gcacacatcc 2767
atggacactc aggatatagt tggcctaata atcggggcat gggtaaaact tatgaaaatt 2827
tcctcatgct gaattgtaat tttctcttac ctgtaaagta aaatttagat caattccatg 2887
tctttgttaa gtacagggat ttaatatatt ttgaatataa tgggtatgtt ctaaatttga 2947
actttgagag gcaatactgt tggaattatg tggattctaa ctcattttaa caaggtagcc 3007
tgacctgcat aagatcactt gaatgttagg tttcatagaa ctatactaat cttctcacaa 3067
aaggtctata aaatacagtc gttgaaaaaa attttgtatc aaaatgtttg gaaaattaga 3127
agcttctcct taacctgtat tgatactgac ttgaattatt ttctaaaatt aagagccgta 3187
tacctacctg taagtctttt cacatatcat ttaaactttt gtttgtatta ttactgattt 3247
acagcttagt tattaatttt tctttataag aatgccgtcg atgtgcatgc ttttatgttt 3307
ttcagaaaag ggtgtgtttg gatgaaagta aaaaaaaaaa taaaatcttt cactgtctct 3367
aatggctgtg ctgtttaaca ttttttgacc ctaaaattca ccaacagtct cccagtacat 3427
aaaataggct taatgactgg ccctgcattc ttcacaatat ttttccctaa gctttgagca 3487
aagttttaaa aaaatacact aaaataatca aaactgttaa gcagtatatt agtttggtta 3547
tataaattca tctgcaattt ataagatgca tggccgatgt taatttgctt ggcaattctg 3607
taatcattaa gtgatctcag tgaaacatgt caaatgcctt aaattaacta agttggtgaa 3667
taaaagtgcc gatctggcta actcttacac catacatact gatagttttt catatgtttc 3727
atttccatgt gatttttaaa atttagagtg gcaacaattt tgcttaatat gggttacata 3787
agctttattt tttcctttgt tcataattat attctttgaa taggtctgtg tcaatcaagt 3847
gatctaacta gaqtgatcat agatagaagg aaataaggcc aagttcaaga ccagcctggg 3907
caacatatcg agaacctgtc tacaaaaaaa ttaaaaaaaa ttagccaggc atggtggcgt 3967
acactgagta gtttgtccca gctactcggg agggtgaggt gggaggatcg cttcagccca 4027
ggaggttgag attgcagtga gccatggaca taccactgca ctacagccta ggtaacagca 4087
cgagacccca actcttagaa aatgaaaagg aaatatagaa atataaaatt tgcttattat 4147
agacacacag taactcccag atatgtacca caaaaaatgt gaaaagagag agaaatgtct 4207
accaaagcag tattttgtgt gtataattgc aagcgcatag taaaataatt ttaaccttaa 4267
tttgttttta gtagtgttta gattgaagat tgagtgaaat attttcttgg cagatattcc 4327
gtatctggtg gaaagctaca atgcaatgtc gttgtagttt tgcatggctt gctttataaa 4387
caagattttt tctccctcct tttgggccag ttttcattac gagtaactca cactttttga 4447
ttaaagaact tgaaattacg ttatcactta gtataattga cattatatag agactatgta 4507
acatgcaatc attagaatca aaattagtac tttggtcaaa atatttacaa cattcacata 4567
cttgtcaaat attcatgtaa ttaactgaat ttaaaacctt caactattat gaagtgctcg 4627
tctgtacaat cgctaattta ctcagtttag agtagctaca actcttcgat actatcatca 4687
atatttgaca tcttttccaa tttgtgtatg aaaagtaaat ctattcctgt agcaactggg 4747
gagtcatata tgaggtcaaa gacatatacc ttgttattat aatatgtata ctataataat 4807
agctggttat cctgagcagg ggaaaaggtt atttttagga aaaccacttc aaatagaaag 4867
ctgaagtact tctaatatac tgagggaagt ataatatgtg gaacaaactc tcaacaaaat 4927
gtttattgat gttgatgaaa cagatcagtt tttccatccg gattattatt ggttcatgat 4987
tttatatgtg aatatgtaag atatgttctg caattttata aatgttcatg tcttttttta 5047
aaaaaggtgc tatcgaaatt ctgtgtctcc agcaggcaag aatacttgac taactctttt 5107
tgtctcttta tggtattttc agaataaagt ctgacttgtg tttttgagat tattggtgcc 5167
tcattaattc agcaataaag gaaaatatgc atttcaaaaa naaaaaaaaa aaaaaaaaa 5226

<210> 70

<211> 228

<212> PRT

<213> Homo sapiens 
<400> 70 

Met Arg Tyr Leu Leu Pro Ser Val Val Leu Leu Gly Thr Ala Pro Thr

1 5 10 15

Tyr Val Leu Ala Trp Gly Val Trp Arg Leu Leu Ser Ala Phe Leu Pro

20 25 30

Ala Arg Phe Tyr Gln Ala Leu Asp Asp Arg Leu Tyr Cys Val Tyr Gln

35 40 45

Ser Met Val Leu Phe Phe Phe Glu Asn Tyr Thr Gly Val Gln Ile Leu

50 55 60

Leu Tyr Gly Asp Leu Pro Lys Asn Lys Glu Asn Ile Ile Tyr Leu Ala
65 70 75 80

Asn His Gln Ser Thr Val Asp Trp Ile Val Ala Asp Ile Leu Ala Ile

85 90 95

Arg Gln Asn Ala Leu Gly His Val Arg Tyr Val Leu Lys Glu Gly Leu

100 105 110

Lys Trp Leu Pro Leu Tyr Gly Cys Tyr Phe Ala Gln His Gly Gly Ile

115 120 125

Tyr Val Lys Arg Ser Ala Lys Phe Asn Glu Lys Glu Met Arg Asn Lys

130 135 140

Leu Gln Ser Tyr Val Asp Ala Gly Thr Pro Met Tyr Leu Val Ile Phe
145 150 155 160

Pro Glu Gly Thr Arg Tyr Asn Pro Glu Gln Thr Lys Val Leu Ser Ala

165 170 175

Ser Gln Ala Phe Ala Ala Gln Arg Gly Leu Ala Val Leu Lys His Val

180 185 190

Leu Thr Pro Arg Ile Lys Ala Thr His Val Ala Phe Asp Cys Met Lys

195 200 205

Asn Tyr Leu Asp Ala Ile Tyr Asp Val Thr Val Val Tyr Glu Gly Lys

210 215 220 

Asp Asp Gly Gly 
225 

<210> 71 

<211> 158 

<212> DNA 

<213> Homo sapiens 

<400> 71 

gccttgcagt attaaaacat gtgctaacac cacgaataaa ggcaactcac gttgcttttg 60 

attgcatgaa gaattattta gatgcaattt atgatgttac ggtggtttat gaagggaaag 120 

acgatggagg gtagcgaaga gagtcaccga ccatgacg 158 

<210> 72 

<211> 1381 

<212> DNA 

<213> Mus musculus 

<220> 

<221> misc_binding 
<222> 608. .62? 

<223> amplification primer g34292.pu 
<221> misc_binding
<222> 740..758

<223> amplification primer g34292.rp 
<400> 72 

gagccgagag gatgctgctg tccctggtgc tccacacgta ctct atg cgc tac ctg 56

Met Arg Tyr Leu
1

ctc ccc agc gtc ctg ttg ctg ggc tcg gcg ccc acc tac ctg ctg gcc 104

Leu Pro Ser Val Leu Leu Leu Gly Ser Ala Pro Thr Tyr Leu Leu Ala

5 10 15 20

tgg acg ctg tgg cgg gtg ctc tcc gcg ctg atg ccc gcc cgc ctg tac 152

Trp Thr Leu Trp Arg Val Leu Ser Ala Leu Met Pro Ala Arg Leu Tyr

25 30 35

cag cgc gtg gac gac cgg ctt tac tgc gtc tac cag aac atg gtg ctc 200
Gln Arg Val Asp Asp Arg Leu Tyr Cys Val Tyr Gln Asn Met Val Leu

40 45 50

ttc ttc ttc gag aac tac acc ggg gtc cag ata ttg cta tat gga gat 248
Phe Phe Phe Glu Asn Tyr Thr Gly Val Gln Ile Leu Leu Tyr Gly Asp

55 60 65

ttg cca aaa aat aaa gaa aat gta ata tat cta gcg aat cat caa agc 296
Leu Pro Lys Asn Lys Glu Asn Val Ile Tyr Leu Ala Asn His Gln Ser

70 75 80

aca gtt gac tgg att gtt gcg gac atg ctg gct gcc aga cag gat gcc 344
Thr Val Asp Trp Ile Val Ala Asp Met Leu Ala Ala Arg Gln Asp Ala
85 90 95 100

cta gga cat gtg cgc tac gta ctg aaa gac aag tta aaa tgg ctt ccg 392
Leu Gly His Val Arg Tyr Val Leu Lys Asp Lys Leu Lys Trp Leu Pro

105 110 115

ctg tat ggg ttc tac ttt gct cag cat gga gga att tat gta aaa cga 440
Leu Tyr Gly Phe Tyr Phe Ala Gln His Gly Gly Ile Tyr Val Lys Arg

120 125 130

agt gcc aaa ttt aat gat aaa gaa atg aga agc aag ctg cag agc tat 488
Ser Ala Lys Phe Asn Asp Lys Glu Met Arg Ser Lys Leu Gln Ser Tyr

135 140 145

gtg aac gca gga aca ccg atg tat ctt gtg att ttc cca gag gga aca 536
Val Asn Ala Gly Thr Pro Met Tyr Leu Val Ile Phe Pro Glu Gly Thr

150 155 160

agg tat aat gca aca tac aca aaa ctc ctt tca gcc agt cag gca ttt 584
Arg Tyr Asn Ala Thr Tyr Thr Lys Leu Leu Ser Ala Ser Gln Ala Phe
165 170 175 180

gct gct cag cgg ggc ctt gca gta tta aaa cac gta ctg aca cca aga 632
Ala Ala Gln Arg Gly Leu Ala Val Leu Lys His Val Leu Thr Pro Arg

185 190 195

ata aag gcc act cac gtt gct ttt gat tct atg aag agt cat tta gat 680
Ile Lys Ala Thr His Val Ala Phe Asp Ser Met Lys Ser His Leu Asp

200 205 210

gca att tat gat gtc aca gtg gtt tat gaa ggg aat gag aaa ggt tca 728
Ala Ile Tyr Asp Val Thr Val Val Tyr Glu Gly Asn Glu Lys Gly Ser

215 220 225

gga aaa tac tca aat cca cca tcc atg act gag ttt ctc tgc aaa cag 776
Gly Lys Tyr Ser Asn Pro Pro Ser Met Thr Glu Phe Leu Cys Lys Gln

230 235 240

tgc cca aaa ctt cat att cac ttt gat cgt ata gac aga aat gaa gtt 824
Cys Pro Lys Leu His Ile His Phe Asp Arg Ile Asp Arg Asn Glu Val
245 250 255 260

cca gag gaa caa gaa cac atg aaa aag tgg ctt cat gag cgc ttt gag 872
Pro Glu Glu Gln Glu His Met Lys Lys Trp Leu His Glu Arg Phe Glu

265 270 275

ata aaa gat agg ttg ctc ata gag ttc tat gat tca cca gat cca gaa 920
Ile Lys Asp Arg Leu Leu Ile Glu Phe Tyr Asp Ser Pro Asp Pro Glu

280 285 290

aga aga aac aaa ttt cct ggg aaa agt gtt cat tcc aga cta agt gtg 968
Arg Arg Asn Lys Phe Pro Gly Lys Ser Val His Ser Arg Leu Ser Val

295 300 305

aag aag act tta cct tca gtg ttg atc ttg ggg agt ttg act gcg gtc 1016
Lys Lys Thr Leu Pro Ser Val Leu Ile Leu Gly Ser Leu Thr Ala Val

310 315 320

atg ctg atg acg gag tcc gga agg aaa ctg tac atg ggc acc tgg ttg 1064
Met Leu Met Thr Glu Ser Gly Arg Lys Leu Tyr Met Gly Thr Trp Leu
325 330 335 340

tat gga acc ctc ctt ggc tgc ctg tgg ttt gtt att aaa gca taa 1109
Tyr Gly Thr Leu Leu Gly Cys Leu Trp Phe Val Ile Lys Ala *
345 350 355

gcaagtagca ggctgcagtc acagtctctt attgatggct acacattgta tcacattgtt 1169
tcctgaatta aataaggagt tttcttgttg ttgttttttt tgttttgttt tgttctgttt 1229
taagccttga tgatngnncn cnnnnnnnnn ncnantcnng ngaccacagc caacatgcat 1289
ttgatttggg gcaaacacat gtggcttttc aggtgctggg gttgctggag acatggaagc 1349
taagtggagt ttatgctgtt tttttttttt tt 1381

<210> 73 
<211> 15766 
<212> DNA 

<213> Mus musculus
<220>

<221> exon
<222> 52..121
<223> exon2
<221> exon
<222> 682..797
<223> exon3
<221> exon
<222> 2628..2717
<223> exon4
<221> exon
<222> 7834..7924
<223> exon5

<221> exon

<222> 9804..9965

<223> exon6

<221> exon

<222> 11404..11527

<223> exon7

<221> exon

<222> 13539..14035

<223> exon8

<221> misc_feature

<222> 13762..13764

<223> stop CDS

<221> polyA_signal

<222> 13835..13839

<223> AATAAA potential

<400> 73

tttttttttt ttaattgtca aagtcatgat tctttttgtt ttctctttta gatattgcta 60 
tatggagatt tgccaaaaaa taaagaaaat gtaatatatc tagcgaatca tcaaagcaca 120 
ggtttgtatt tcatttgatg aaatttgggt ttttctagaa atggtaaatg agcattaata 180 
tgtacacaca catacacaca aacacacata tgtacacaca catatgtttt aaagacagga 240 
tttcatgtga cccagaatgg cctcatactc tctgagtagc tgagaatgat tttaagcttg 300
tgacacacct gccttcatct ccaaggtaca ggaattgcag gtgctttctt tgnnnnnnnn 360
nntttttttt tttgagtttt ggggaggggg tatatttttt aatgtgtctg tagttggctt 420 
tgttttaagc attttaatca tactttattt ttaaaaaaac taaaagcttt tttaaggcta 480 
ggtcttgcta tgtggcccta gtgttcctgg gacttgctct gtacaccggg ttgactctga 540 
gcctgtgcgc cttctgcctc tgcctccata gttagattct caggacatgt tacaaagact 600 
gtgctgtgaa gatgagtttt tgttcctggg agggaaggtt ggagctgact tgtgaggtac 660 
tgacttgggt ctgccttaca gttgactgga ttgttgcgga catgctggct gccagacagg 720 
atgccctagg acatgtgcgc tacgtactga aagacaagtt aaaatggctt ccgctgtatg 780 
ggttctactt tgctcaggta aactttgtct ttgccctttt atttcaaact taacaccatt 840 
taatgaaact atatctgatt tttttgttta tgtgtttgtt ttatggtacc cgtgattgaa 900 
catggggtca tatgtgtgct actgagtgac agccttagtt cagacatttt ttaaagcgac 960 
ttttactagt atttttattt agaattctat atgtgtgcac atgcatatgt gtgcttgtgt 1020 
gcacacgtgg atgcatgtga ggtcgaagga caattttcag tacaagtgtg agtgtcactt 1080 
tttaggcacc ttccactctt attttgagac agtctcctag acctttgctg agttgcccag 1140 
gctagccggc cagtgagccc tgggcatcta ccggtctctg cctccttacc tttacttagg 1200 
ttacaagtgt gtgctgctac gcccagctgt ttactagatt ctagggatcc aaatgtgggt 1260 
cctcgtaact tgtgagacaa gtactttcca aactgagcca cctccctagc tcttcttcac 1320 
ggttcctgat ggtgtgtgtc tagatggctg gttgtccgta tatttaagtc cagtagcaga 1380
aatacaaata cctaggagtc caatagaaag ctacaagtgc agaattgaca atcggtaatg 1440
ttcggaaatt gattcaaaag tagttagtga gtgacagaca ggagctaaaa gcagactctg 1500
agctcagagt gtgaagtgtg gagaaatgtg ttttctcaca gttctgaagg ctgaaagtct 1560
ccccaaggtc aggatgtggg tggtactgct gtctcccaan cacccacctc tttggattat 1620
agactgcagc cttctccctg tgttctgagc cggcctttcc cacatgtgga catccttggt 1680
gggtgttcca cc^gcagggc ctcagctagt gcccttattt cacttaactg taatgatttt 1740
cttaaagacc ctgtctccat acacagtcac tgtggaagct gaagcttcaa tgtaagagtt 1800
aaggggggag ggggaaattt agtccataat ggtgtcacac caatctctgt agctgagtcc 1860
atgattcagt tctttaaagg ctctgagtgt agacattatc ttaattattt tgcccattta 1920
tgtattatct tt?iatttatt ttatgtaact gaatgcctgt gtatatatgt ttctggttcc 1980
tagtccatat tttaattcct taaaggatgg aggtgtagac ttttgtcttt ttaattttct 2040
atccttccct cctggcctcc tgtggcctct tttacgtatt tattattttt aatttatttt 2100
atgtgtttga gtgttttgta tctatgcatt cctggggccc atggaggtca gaaaaacaca 2160
ttaggtggcc tacaactgag ttatgggtgg tttgtgacca tggggtgctg ggacttgatc 2220
accagtctct gagaagactc gtgtctgctg agccttctct ccagtcctgg gagtgtggat 2280
attttaagga tacttttaat tgacttggtg aatgacagta gaaaatcaa^e gagttaggat 2340
ccatcggaaa aagcttttga actaaatctt ttaaagagaa aatattttaa gtgctaacaa 2400
aattaaatgt gtattttcca tgatgcagtt ttacttgggc tctgtagaaa taggattttc 2460
aggtacatat tgtatatata gttggcaata tttaaatact aactgtcgct tgagttctga 2520
aatgtagttt tatgtttttt actcattagg agtacagttg ccttaataac tacggagatt 2580
agttattaaa gaataattgc tcttcttttt tcttttctgt gtaccagcat ggaggaattt 2640
atgtaaaacg aagtgccaaa tttaatgata aagaaatgag aagcaagctg cagagctatg 2700
tgaacgcagg aacaccggta agtgcgcccg cttttattcc tcaaggcagg ttaagaagtt 2760
aagttcttaa gtcattttga aaatatatta ccccatgtgg agcaatggaa ctggttcggg 2820
gttttgttga gataagctgt cctctggccg tgaggtaaga ttgctgcagg tgattgtaag 2880
gtttctcctg agtaacagtc agcatgggct cgggacgggc aagggcaggc cttagtgtgc 2940
agaggatgga gctcactgaa gccccaaaga gttagtcttc acatgagatt cagttctaga 3000
agaagttaaa ttgctttctt tctgtgtaaa tttggatttt tattgtagaa attaaagttt 3060
gttttctttt aaaacaaaca caaacccaga gcaaagagtc tcctagtgaa gagtcattcc 3120
gtgtcagtat tttacacaac tgtttttctg taaaggggga aaaagaattc aaatcttctc 3180
tttcaagaat gctgactgct gccaactgcc tctccccgtg gcccctctct gtatagacag 3240
gcatagctat ggtgaggact tgggcggctc ttgtctttct cctctctctg cttctctacc 3300
ctttctctcg tgccctccac ttaccaggcc ctgggaagct acacaccagg caacagtgac 3360
cagggcctcg gcctgggctt cgaccaatta ctagagcaga aacagcagca gctgcagtgt 3420
tgttttgtgc tgtgcactgt attaggttgt gttttcatca cctttgggtt ttgtgatgtt 3480
ttgatgaagt cctggtacca ttctagtttt tacattctgg gtagatagag tttattcaag 3540
gtctcaaggc at^tgaatgg aagagctcct ctttacagcc attcgtgtag catgcataac 3600
tgctcttctg tattctctct agtgtctttt tttttgtgtg tgaatctgat gtcttgttat 3660
tcacctacaa tgtggagtaa tggtcataaa catataaagt acttatgcct ttatctgcca 3720
aattgtattt aacttttcag cttttaatat aactttttat ataataatta atttatttta 3780
aaaaaaattg aataccagcc tgttatagtg gcatatgcct gtgttcctag cactcaggag 3840
acaaaggcag aagtgtgaga acttcagact catactcagc tatatacaag accccaaatt 3900
tgtgctagat tctgcagtac agccatgagt gtccccatct tagagggaga tcgctcatcc 3960
ttgtgctgtt ctttaagtct taccctgcaa cccactgtaa gtacactctt gctcacagtc 4020
ctttagaatc tcacactctt tctctttaca gacaccatgt cattgcccac tttattattt 4080
atctgatgtc tacaaagatt atgaaagaga aacttgtatg cattctgtgt aaagtacttg 4140
acacaaataa tagtattcaa gaatgacttc ttaaatgaac actgaatgaa tagtttgttc 4200
taattttttt gatcaacaaa tcaaaaaata tttagattaa atatctaaga tacaaagcat 4260
aataccacat gaatcattaa agtgagtaat caatcttata agtgactgac cctaaaactc 4320
atagacatta ataattgctt tcattgctta gatataaact ttattgatta atacgttctc 4380
atgaaagtgg ttcttggaag gttctggaaa cgaaaatatt tttcttactg cttttttctt 4440
ctagtaactg attgaatttt tctgcagttc cataaagcat ctggtcaatt gctattatcc 4500
aatatgagga tatataacaa agtattgatt tttaaatttg gcggtgataa gacaagactg 4560
ggcgtgtgaa tgagggggtc tctgtttctt gtcccttctc ttgggttctt ttccttttgt 4620
tggtttgcct tctccagctg ctatgtgatg ggttctgatt atcttattat atcttatttt 4680
gttatttttc attgttatct cttagaagcc aacatgttat atcacctcca ctcccaccat 4740
taggtgtnct cacaaatacc ccaagctaaa caaccacatc atgtcatgtt nctgtatact 4800
tccataagtg ttgttaactc tactgactct tgtgagcagg cccaattggc tttatcccta 4860
gctgggtgac ctgggttcct ccccaacacc ataccgtcca tcaaactgag tccttttcca 4920
agcacacacc agatactgct catctgagga ctcttctcat ccacctaagg actgcctgct 4980
cctcggcaga aagggcctct agtcccatac ccttacgccc tcaccaatgc cttaggaaca 5040
tgtgctcaat gcccctgtgg gtcatttccg tttacagtag ggaaatttgc ctgataactt 5100
gcagcacacc tataaagagg ccttgcttgc tctcatattt agctggagaa gataatgtac 5160
tcaccaactc cactctatgc aacccagtct gctctgccca tgccagtcag acgtgaatct 5220
tacacctgga ttcagattga tgaatctaca acatcaccca ctccatgctt ccttctaaat 5280
cagcagttct agcctgaatg acagatgcta cccaagtctc atctagttag ccctgtccgg 5340
agtaaccctg accttgagga ttagaccagg atgcacatcc tgcaccagtt ccctttgtcc 5400
acctgacttc atcccacccg ggccatagcc catgctcagg ctccaccctc catgcacaaa 5460
gctggctttt ccagcttcct tcacctgtat cagacacaaa tagcaaaagg ggtccacgtg 5520
cctaggtccc atcacaagac catgtgcggt agtttggaaa acagtctcca cttgaggctc 5580
agatagnttg gaatcttggc tctcatgtag ttgtactgat tagatcagtt taggaagtat 5640
gacctttntg gaaagaacat ataactggga ggggctttga gatgtaaagg ccccacacaa 5700
ttcctagttc aaactntact gcctgctcaa agcttgaggc atgaactctc actgttcctg 5760
atgtcatggt tcctgtctgc ttccacaatt ccctatcatt aggggtccct ttccttcctg 5820
gaattttaag tataaataaa cncttctttn taaaacaaca agaacaacaa atctgacnct 5880
gataatggat tttaaggcgt cttctctgga taagaaaaaa aaaagaatat atttgcatag 5940
gtgctgtatt acttttgtca ttggtataac ctgactggaa gcaacttaaa ggaagaagaa 6000
tgtatcttga tttgtagatt aagagcacca tgactaagaa ggcatagcag cacaggtgca 6060
ccagcaagaa cataggctgc tagctcagat ctctgtagat atgggaacag ggcaggaagc 6120
tagtagtcta taaacctcag gacccatccc atggagttcc ttgtcttcca gtgatgtcct 6180
gtgtcttaaa gtttcacagt tcccacagca gcacctgccg tctgggaacc aacctgtggt 6240
ggatatttta caacgtgata ggcatatttt gtctctagcc ctgtaggttt atagccatcc 6300
tatacttcag tttatctagt ccacctcagt ctgatggtct tatagttcca acacttcaaa 6360
actacaaagt cttaagggcc atgggctcgg gtttattaga gcagtaacac ctctactagc 6420
tttctgtgtt acccactcct cttaaggtct ggttgaaatc ctaataggaa gcagcttgag 6480
aggagggttt attgtggccc atactttgtt ggtacattct atcatgcaag ggtggcactg 6540
tgatacagcc gaggccatcc gaggatggta ctgttggctt acatctgggt gggacaggaa 6600
atggtaattc tcaaggccca cctgcttggt gacttctttc agttaagccc catactctaa 6660
atcctctaca acctcccaac ataatgccac cagctgggga tcagctgttg acagtgctgg 6720
cccaggggag cagtttaaat ccagaccagg ggacctgaaa acagagaact gcagaggggc 6780
tgtgggactt tataccagct ttgcagacaa atcacggcat ttctttgtga gcttggttca 6840
taaacaaata tatattctcc tataggctcc tttagtgggt gtttcatatc cacaaatttg 6900
ttcagaaaaa cactgtgttt tatgctagct gtgtaggaga taataccgct gggagtcact 6960
tgagcatgga taagtgacat agttcgtcct catgagtccc tgtcctgttt ctgtattatg 7020
tttacttgat gagtttagtt tgtcagttgg ccaccaatta aaaagtatca ttttattttt 7080
tttacaatac tcagttctca agttaggagt tttgttatta tatggcttca atattcacat 7140
tttaaccttt ccaggagtta agtataaaaa cttatatcaa ctgttgactt agtaaatatc 7200
tattacagat actatattct tcttagttta tatcatgaat atgaggttgc ttaaagtaag 7260
tgatgtaaaa tacactaggg gatgcttata aaatggaatg ttgtgagttt tttgaaacac 7320
gagtactaaa ttcataagtt tttaaatagt tacactgtta gcttcagtac tgctagatac 7380
atgtctataa tggctgaaga gtggagcttg gatattataa gtgtactctg tatattcatg 7440
cagacatata gcagattcca ctagtatgtg tggttaatat gtgctaataa aaatttaata 7500
caaaagtcat gttttattac tgggaaccag aggggttggt tgtgctgatt ttaagtcagt 7560
gactattagc atattctaag aaacagtttt naggatttta aagattggct ttaccataaa 7620
tgtagagcta tgttttacta taatccatat tatggtcggc cttaattcaa tctctgcagt 7680
ttggttactc tgctcaaagt gaaggtcatt tataaatgat acacattttc tcaccatagg 7740
aaatactacc tggccaataa cagagttaga attgctaaat tgatggtacc aacaatggac 7800
tcaacacaaa ctaaagttta tttatgccca cagatgtatc ttgtgatttt cccagaggga 7860
acaaggtata atgcaacata cacaaaactc ctttcagcca gtcaggcatt tgctgctcag 7920
cggggtaagt aaagatttaa ctgtattcag aaaaacactt ttttaagaag agtgatcttt 7980
gtttccttca gagtcatact aaagaatatg cgtttcttgt aagagctaag tgagagaata 8040
tccgatcttc tacagagtta ggtatattct tattagtctg tgtctgagag gttagagacg 8100
caggcttgct atggcacatt tcccatgctg tgaattgagt taaaaatgta ggtaaatgat 8160
atccccaaga aagtatactt ttnggagtga ctcagtataa agcctggtgt tataacataa 8220
acacgcacgt gcggatgtat atgtagcaca tatgtaaaca caggtatatg canttgtaat 8280
aagaaagtgg aggtcggggc ccactgcagg caagtctttt agtgatgctg agctaatgct 8340
gagaggtaga aagaccaaga aggctggagt tgctcattcg gcaaaggtca gagctcactg 8400
tgtgccataa ctcgagtgtt ctgtctccct tttgatacag ttttcttgtt tttaattatt 8460
agtttttaca attatcccat aaaatgtggg ctcattgtgg tcatcgtttt cataaagtcc 8520
ttcaagtata cacccagcaa gtatctaaat acactgggaa gaatcagtca gctgatggct 8580
tgaagtttca ggacatctag tgccacatca tgcttcagaa ccgacctgca cttagtcagg 8640
gtcatattca tgccacgtga agacgagagg aggccatgcc gtctgactta ggatggaaat 8700
ttccttcgag caaacacgaa cgggctaggt cttagttata ggcatagtgt ctgtggttat 8760
actaggcaga cattagtgga ctgggtgtta gaaggtacag acaggcaaga atttgctgta 8820
gatttgtttc cctcatgtgt tgacaccaca tctaacctgc tttttgagct tctagtccta 8880
ataatctcat aaaaatactg gttgaaccag aaatggtgtt gcaaagctat gatcccagct 8940
cctgggatct agggtgggag gatcataaat ttgaggccag cttgggtctg tcttagagaa 9000
aaaagaaaaa taaaaagtct ggtcaaggta acatggagcc tggaagtttc acagggtgat 9060
tctgtaaagg tcctgagaca agatggcctc tagtggcgaa tgacttagct gacaangaaa 9120
actttcccag cttggttgac ttttcagact tcatacaagt ttgtgaataa attacactcc 9180
ttctgccctt gggactgaac tcagatatgt ggttgtggga atggctttct ttcccacacc 9240
accctgcatt ttaaaaattc ttctgtagac agtcccacca tcctgtagct gttcttcctt 9300
atgtcgccac tttccctgga gagaggcagt gcagacttca acccgcttct ccctagtcgc 9360
tgttcatagc acatcgaaag acctagtgct tcctgtgaaa ttgtaagtac atcctggagt 9420
ccaggagagg aggaagccga acagagtgga gggaatgctg agttctgtcc taagaaagac 9480
tgcgtgctta gcaagatgct gctgctctcc tgtcgtgtct ttcttgtcag aacttatcaa 9540
agagaaggct cgcagtgggt cataatcttc ccaaggacca gccttcccag cttctcgcag 9600
catatctcat tcatgtagat gtttaatgga tatgtgtcaa tggggttgac ctaagtgaga 9660
tggcaatgta tgtgagcatt ctaggtgtga ggttatggca ttaaacttta atttccgtct 9720
atttgtggta gttgataagt aatttagatg ttgactttca tgtattccta attatgacca 9780
cattgaatct acctgctttc taggccttgc agtattaaaa cacgtactga caccaagaat 9840
aaaggccact cacgttgctt ttgattctat gaagagtcat ttagatgcaa tttatgatgt 9900
cacagtggtt tatgaaggga atgagaaagg ttcaggaaaa tactcaaatc caccatccat 9960
gactggtaag tccgtatttc catagaagct gaatagtaca tggtacaggt aagataaact 10020
cttgtttgtt cgctttgctt agcttggttc agtttggttt tcagtagagg gttccactat 10080
gaagctctgg ctggccggga actcactatg tagaccaagc tggccttggg ctccactaca 10140
cccagcacca atcacccact cttatncttt tatgcntttt tgtttttgct ttgagctttc 10200
tttataacat gtttgggaag gacattgtca ttatttacaa gaagaaatat ggtcttttcc 10260
caacatgcta gaatttaaag actcagaact cttgcctttg tcagtgacaa agtgagaatg 10320
gctgtgaagt gacgtggctt tgagtgagaa tagttcaggt aactatagcc acagactcaa 10380
catttgaaca tgggaacagg tgagaacgga gtgatggaag attctggccc ctttcagaga 10440
attcatttta gagagagatg agagtagtaa ggaagagaga agagagagac gtggtatttt 10500
gctgcagact aaagagatct cttataatcg cagtactaag gaggaagaag cagaagatga 10560
tgactacagg gccaggctga acaatctagt aaaatcctaa gtcaggaagt cagggctgag 10620
gtgcagctca gtaggagagt tgttgtctgc cctacacaag gcctggattt agctcccagt 10680
agcaacgaag ggaggcgagg gtgggcaaaa tcgaacactt actcttggag actcccttta 10740
tgaatattac cacactccag taaatactct ccagagattt cagatgagat tctgcttcct 10800
ggtaaacagg aggccaagaa tattatgtca cactgaacat gggatggaag acatgttctg 10860
aggaatgtct gcactccagt gtgatgaaga cttgaagttt agggacattt tccctccctg 10920
gccccactca ccccatctgt attgagtatt cccctagtgc tcatctttat ttgtatgtta 10980
actttcagga aggggaagca gattgatatt caaacccagc cagttttctt aaatactttg 11040
tggatgggat tggctttgac agtaaatgag gaaatgtaaa atgtaaaaga ttctaatttt 11100
taatatttta aaggtgaggt tttctgttag tacgcagagt gagaggtttc ttactgatgt 11160
ctgcgtacct agaggaagga tggctacttc tccaaggctt gctgttagaa gtcagtgaca 11220
tgggcttaac aa^agatatg tgctaatgag gttttaattt cagcttaata ctgcaaatca 11280
taagtgcata gctttattgt tttaaattct tttagtctta atgtttcatt tttaccataa 11340
gttactttgt ataatcacaa attctaaact agtaagacgt gaaattttct tcttctttgt 11400
tagagtttct ctgcaaacag tgcccaaaac ttcatattca ctttgatcgt atagacagaa 11460
atgaagttcc agaggaacaa gaacacatga aaaagtggct tcatgagcgc tttgagataa 11520
aagataggta agtggtaaga gctccagcat ttagaaagtg cagttcaacc aaattttact 11580
ctcagatcct gcttgaaagg agtcttttta tcttcattat ttagtaaata ctaatcatac 11640
ctgcatagac aagaccacat atacttaaat gtagcatgtt tcatggtgcg ttacccttgt 11700
ttaacaatta agtttaacat cctacatcag tttgcctgtt gatttctgta ccatgacaac 11760
tcaacacagc gatgcgttta ttccaaagtc gatagcacag caaaagtgaa actaaagtct 11820
gtattgtttc aagaatgctt tttgtgaact cgggttaaat cttattctat cctttcgtgt 11880
tcacattgta cattttcatg agtcactata aaaatcatga catggtggcc tacctgcagt 11940
gtttgctgga cagtaggctg ctgtgtgata agagcctttc ctcttcagct acacggggga 12000
cacgaggctt tggggttcaa gactgaagca cgggtgagca caacaccttt gtgttgtggg 12060
aaggaaggga attgttcttt tcataatgaa attgtcccct ttcttgagtt agtagaaagt 12120
attacaagga tagagagttg aaatgaagct ttatattaga tttatgcctt gtgttgtcac 12180
gtgtttctac ctgacataac ttttcaaccc agccgctcag gattattttg atgatgggaa 12240
caatgtaaga aggcctatgt atcggtaact cactgttgta gctctgtgga ancggntcnn 12300
caggcagtag ggacgcttct gtgcttttgt gcctgtcctg ctgttagaat cttacagagg 12360
aggatgaatg aatgaccctt tttatttctc ttgtctgctt ttctaatttt atgggaataa 12420

gaacttttgg taggtctctg tcactggcct cttgttgtga agagacacct tgagcaaagc 12480 

aactcttctg agagaaagca tttagttggg gaattcctta cagcttcaga ggttgagttc 12540 

gttttcatca tgctgaggac agggaggcac tcaggcagga gaagtagttg agagccacat 12600 

tctgatctac aggcagagag agacagactg agcctggcat gggttcttgg aacctcaaag 12660 

cctctcatcc ctaccccatc tcccgacccc tatacacact tcctccaaca aggctacacc 12720 

ttctaatcct tcttaaagag tcaccacatc cagcgactaa gcattcagat atgtgaacct 12780 

gtcggagcct ttcttactca gatcacctca ggaggaaaac tcctatgcta taagaatttc 12840 

ttttctttcg catctttgaa agcttgtttt tgtgtgatta gatcctggcc tcacacatgc 12900 

tcggcaatca ttttactgtt gagctccagc ctcagccgtt ttcattggct tatgggatgc 12960 

gagccatggg agagaagcta gaaggccttt cgttttatga gtcgggttgg tggaaccact 13020 

tacagatgga agatttacaa acaaaaatga agctggggcc atcaaggctc agcactcgct 13080 

gctcttccag agagttcagg ttcatttctc agtaaccaca tggtggcttt gtaaatgtaa 13140 

cttcatattc aatgaccctg acaccctctt ctggcctctg tgggcaccag acacaatcat 13200 

ggtatacaga cacacacact agccaacacc catctacata aaagtatata anacatatct 13260 
ttatcttaaa aatccccgaa gtcctcatta aatatcttag atccccgccg tgttttgatt 13320

tttgtttccc acgtggtgag gatataatat catgtccaaa ctgtaaggag tgaatgccct 13380

cccgtgcctc tcggacacct ctgcactcat ccaagttttc taaggagctg tacttgctca 13440 

gcaagtactc aatacctaat aaatggttta tgtttgtttc aacaccaaaa atgtccaaaa 13500 

ctgaaagatc aattctgttg ttttccttct ggccataggt tgctcataga gttctatgat 13560 

tcaccagatc cagaaagaag aaacaaattt cctgggaaaa gtgttcattc cagactaagt 13620 

gtgaagaaga ctttaccttc agtgttgatc ttggggagtt tgactgcggt catgctgatg 13680 

acggagtccg gaaggaaact gtacatgggc acctggttgt atggaaccct ccttggctgc 13740 

ctgtggtttg ttattaaagc ataagcaagt agcaggctgc agtcacagtc tcttattgat 13800 

ggctacacat tgtatcacat tgtttcctga attaaataag gagttttctt gttgttgttt 13860 

tttttgtttt gttttgttct gttttaagcc ttgatgattg aacactggat aaagtagagt 13920 

ttgtgaccac agccaacatg catttgattt ggggcaaaca catgtggctt ttcaggtgct 13980 

ggggttgctg gagacatgga agctaagtgg agtttatgct gntttttttt tttttttnaa 14040 

tgttttcatg aattaatgtc cacttgtaaa gattattgga tactttctgt aattcagaag 14100 

gttgtatttt aacactagtt tgcagtatgt ttcgctatat tggttatctt ccatttgact 14160 

acttggcagc tcagactctt aatactaaag tattttacat tttgaagcta tgtgatactg 14220

gttttttgtt gttgttgttg ttgttaattt ctgaaagtca atgaaagaca ctgtaatgat 14280 

gcgttaagat gttccaagaa aaaggtgaga attattcatg gcaaaaaaga tctgtctagt 14340 

gtatattttt attatattgc tctatttagc taattttctt tatatttgca aaataatgaa 14400 

catttttaat atttattaaa atgcttgatt tgcatacccc cgattctaca gagaataatg 14460 

tgtaaagtgt cagaatagac ttgaagctct gctgtgactc agtctccttt gtcagagctt 14520 

ctagtagccc agctactgag ctgctttgtt agtacctcca gcacctgagc cgttaagtac 14580 

ttataaatgc aagggacccg ttatcttcat atcggaatag acatgaacag agctctaagg 14640 

cgatgaaagt ctgccagcat cctctctgtc ctcgcacgtg ccttctgcct ggctccattt 14700 

gctttggcac tgcgttcgat ctagagtgta ggtgctcact gcttatttca gccctggctc 14760 

tgtggttttg tgtcctccag tggtgctgtt cactgttggg gtgcaggtgg tgctgccctg 14820 

actcagaggg gcagctccct ggctcctgag ggtgagcctt cttggctact acagaagtat 14880 

tgtgcgtttg tgtatggcaa gaaccatcag gattggataa atgtgttatt tctctttgat 14940 

ttccatggag ccacactgtt ggtacatgtc ccctgtgaac agagctacct ttcaggagca 15000

catcatactg tcgtgagtca cggcacggtg tgtcctgtga gaagaggctt tctaacgtgt 15060 

gatttgccgt gtttctatgt tgtgatttaa gcgtgattgc ctactagtca ttcaaggtaa 15120 

catttctgca aatttcatac agatttttgt cacaaaatta ctataccaat gatctagttg 15180 

aaatagacca attgaatcac aataaataat tttttttaat tgagggaaaa tttgcttctt 15240 

gttttttcaa agccagaaaa cgagccattt caaacatctt tgaagagtca tgtgctgtca 15300 

cttgttttct atgtgttagt gtctatattc atgtatggat acacatgaac atgtatattc 15360 

atacacacac gccaatagaa tataacagcc taaaaacaat ccagcttgtg tatcatgtta 15420 

ctgtgctgaa ttgtaatggt ttttacttac aaagtgaggc taaaatcgat ttcatgtctt 15480 

tgttaaatac gtttttttca gcaatcctat tagagcttat tttgaccaga tcaaaataag 15540 

tacaagttca gagactttaa atatggctga ggtctagagc gatagctcag tagttaggaa 15600 

cacatgccac tctttcaagg gcttcagttc ccagcactca tatggaggct cacagaaggc 15660 

tggaattcca gcttcatgga attggacaca tcctctagct tccatggatc tgtctgtctg 15720 

tctctccctt ctctctctct ctctctctct ctctctctct ctctct 15766 
<210> 74 
<211> 354 
<212> PRT 

<213> Mus musculus 
<400> 74

Met Arg Tyr Leu Leu Pro Ser Val Leu Leu Leu Gly Ser Ala Pro Thr
1 5 10 15

Tyr Leu Leu Ala Trp Thr Leu Trp Arg Val Leu Ser Ala Leu Met Pro
20 25 30

Ala Arg Leu Tyr Gln Arg Val Asp Asp Arg Leu Tyr Cys Val Tyr Gln
35 40 45

Asn Met Val Leu Phe Phe Phe Glu Asn Tyr Thr Gly Val Gln Ile Leu
50 55 60

Leu Tyr Gly Asp Leu Pro Lys Asn Lys Glu Asn Val Ile Tyr Leu Ala
65 70 75 80

Asn His Gln Ser Thr Val Asp Trp Ile Val Ala Asp Met Leu Ala Ala
85 90 95

Arg Gln Asp Ala Leu Gly His Val Arg Tyr Val Leu Lys Asp Lys Leu
100 105 110

Lys Trp Leu Pro Leu Tyr Gly Phe Tyr Phe Ala Gln His Gly Gly Ile
115 120 125

Tyr Val Lys Arg Ser Ala Lys Phe Asn Asp Lys Glu Met Arg Ser Lys
130 135 140

Leu Gln Ser Tyr Val Asn Ala Gly Thr Pro Met Tyr Leu Val Ile Phe
145 150 155 160

Pro Glu Gly Thr Arg Tyr Asn Ala Thr Tyr Thr Lys Leu Leu Ser Ala
165 170 175

Ser Gln Ala Phe Ala Ala Gln Arg Gly Leu Ala Val Leu Lys His Val
180 185 190

Leu Thr Pro Arg Ile Lys Ala Thr His Val Ala Phe Asp Ser Met Lys
195 200 205

Ser His Leu Asp Ala Ile Tyr Asp Val Thr Val Val Tyr Glu Gly Asn
210 215 220

Glu Lys Gly Ser Gly Lys Tyr Ser Asn Pro Pro Ser Met Thr Glu Phe
225 230 235 240

Leu Cys Lys Gln Cys Pro Lys Leu His Ile His Phe Asp Arg Ile Asp
245 250 255

Arg Asn Glu Val Pro Glu Glu Gln Glu His Met Lys Lys Trp Leu His
260 265 270

Glu Arg Phe Glu Ile Lys Asp Arg Leu Leu Ile Glu Phe Tyr Asp Ser
275 280 285

Pro Asp Pro Glu Arg Arg Asn Lys Phe Pro Gly Lys Ser Val His Ser
290 295 300

Arg Leu Ser Val Lys Lys Thr Leu Pro Ser Val Leu Ile Leu Gly Ser
305 310 315 320

Leu Thr Ala Val Met Leu Met Thr Glu Ser Gly Arg Lys Leu Tyr Met
325 330 335

Gly Thr Trp Leu Tyr Gly Thr Leu Leu Gly Cys Leu Trp Phe Val Ile
340 345 350

Lys Ala
<210> 75 
<211> 22 
<212> DNA 

<213> Mus Musculus 
<220> 

<221> misc_binding
<222> 1..22

<223> amplification oligonucleotide g34292.pu 
<400> 75 

attaaaacac gtactgacac ca 22

<210> 76

<211> 19

<212> DNA

<213> Mus Musculus

<220>

<221> misc_binding
<222> 1..19

<223> amplification oligonucleotide g34292.rp 
<400> 76 

agtcatggat ggtggattt 
<210> 77 
<211> 26 
<212> DNA 

<213> Homo Sapiens
<220> 

<221> misc_binding 
<222> 1..26 

<223> amplification oligonucleotide BOXIed
<400> 77 

aatcatcaaa gcacagttga ctggat 
<210> 78 
<211> 33 
<212> DNA 

<213> Homo Sapiens 
<220> 

<221> misc_binding
<222> 1..33

<223> amplification oligonucleotide BOXIIIer 
<400> 78 

ataaaccacc gtaacatcat aaattgcatc taa 
<210> 79
<211> 22 
<212> DNA 

<213> Mus Musculus 
<220> 

<221> misc_binding 
<222> 1. .22 

<223> sequencing oligonucleotide moPGrace3S473
<400> 79 

gagataaaag ataggttgct ca 

<210> 80 

<211> 19 

<212> DNA 

<213> Mus Musculus 

<220> 

<221> misc_binding 
<222> 1. .19 

<223> sequencing oligonucleotide moPGrace3S526 
<400> 80 

aagaaacaaa tttcctggg 

<210> 81 

<211> 18 

<212> DNA 

<213> Mus Musculus 

<220> 

<221> misc_binding
<222> 1..18

<223> sequencing oligonucleotide moPGrace3S597 
<400> 81 

tcttggggag tttgactg 

<210> 82 

<211> 18 

<212> DNA 

<213> Mus Musculus 

<220> 

<221> misc_binding 
<222> 1..18

<223> sequencing oligonucleotide moPGrace5R323
<400> 82 

gaccccggtg tagttctc 
<210> 83 
<211> 17 
<212> DNA 

<213> Mus Musculus
<220>

<221> misc_binding
<222> 1..17

<223> sequencing oligonucleotide moPGrace5R372
<400> 83 

cagtaaagcc ggtcgtc 

<210> 84 

<211> 17 

<212> DNA 

<213> Mus Musculus 

<220> 

<221> misc_binding
<222> 1..17

<223> sequencing oligonucleotide moPGrace5R444
<400> 84 

caggccagca ggtaggt 
<210> 85 
<211> 19 
<212> DNA 

<213> Mus Musculus 
<220> 

<221> misc_binding 
<222> 1. .19 

<223> sequencing oligonucleotide moPGrace5R492 
<400> 85 

agcaggtagc gcatagagt 

<210> 86 

<211> 27 

<212> DNA 

<213> Mus Musculus 

<220> 

<221> misc_binding 
<222> 1. .27 

<223> amplification oligonucleotide moPG13LR2 
<400> 86 

ggaaacaatg tgatacaatg tgtagcc 

<210> 87 

<211> 18 

<212> DNA 

<213> Mus Musculus 

<220> 

<221> misc_binding 
<222> 1. .18 

<223> amplification oligonucleotide moPGlS 
<400> 87 

tggcgagccg agaggatg 

<210> 88 

<211> 36 

<212> DNA 

<213> Mus Musculus 

<220> 

<221> misc_binding 
<222> 1. .36 
<223> amplification oligonucleotide moPGlBBaml
<400> 88 

cgtggatccg gaaacaatgt gatacaatgt gtagcc 
<210> 89 
<211> 27 
<212> DNA 

<213> Mus Musculus 
<220> 

<221> misc_binding
<222> 1. .27 

<223> amplification oligonucleotide moPGlSEcol 
<400> 89 

cgtgaattct ggcgagccga gaggatg 
<210> 90 
<211> 20 
<212> DNA 

<213> Mus Musculus 
<220> 

<221> misc_binding
<222> 1. .20 

<223> amplification oligonucleotide moPGlRACE3 , 18 
<400> 90

ctgccagaca ggatgcccta 
<210> 91
<211> 23 
<212> DNA 

<213> Mus Musculus 
<220> 

<221> misc_binding
<222> 1. .23 

<223> amplification oligonucleotide moPGlRACE3 . 63 
<400> 91 

acaagttaaa atggcttccg ctg 
<210> 92 
<211> 18 
<212> DNA 

<213> Mus Musculus 
<220> 

<221> misc_binding 
<222> 1. .18 

<223> sequencing oligonucleotide moPGlRACE3R94
<400> 92 

caaatgcatg ttggctgt 
<210> 93 
<211> 20 
<212> DNA 

<213> Mus Musculus 
<220> 

<221> misc_binding
<222> 1. .20 

<223> amplification oligonucleotide moPGlRACEB . 276 
<400> 93 

gcaaatgcct gactggctga 
<210> 94 
<211> 22 
<212> DNA 

<213> Mus Musculus 
<220> 

<221> misc_binding
<222> 1. .22 

<223> amplification oligonucleotide moPGlRACES .350 
<400> 94

aatcaaaagc aacgtgagtg gc 22 
<210> 95 
<211> 20 
<212> DNA 

<213> Mus Musculus 
<220> 

<221> misc_binding
<222> 1..20

<223> amplification oligonucleotide moPG3RACE2
<400> 95 

tgggcacctg gttgtatgga 20 
<210> 96 
<211> 20 
<212> DNA 

<213> Mus Musculus
<220>

<221> misc_binding
<222> 1..20

<223> amplification oligonucleotide moPG3RACE2n
<400> 96

tccttggctg cctgtggttt 20

<210> 97 

<211> 21 

<212> DNA 

<213> Mus Musculus 

<220> 

<221> misc_binding 
<222> 1. .21 

<223> sequencing oligonucleotide moPG3RACES20 
<400> 97 

gatggctaca cattgtatca c 21 

<210> 98 

<211> 24 

<212> DNA 

<213> Mus Musculus 

<220> 

<221> misc_binding 
<222> 1..24

<223> sequencing oligonucleotide moPG3RACES5 
<400> 98 

tcctgaatta aataaggagt tttc 24

<210> 99

<211> 24 

<212> DNA 

<213> Mus Musculus 

<220>

<221> misc_binding
<222> 1..24

<223> sequencing oligonucleotide moPG3RACES90 
<400> 99 

gtttgttatt aaagcataag caag 24 
<210> 100 
<211> 216 
<212> DNA 

<213> Homo sapiens 
<400> 100 

ctgctgtccc tggtgctcca cacgtactcc atgcgctacc tgctgcccag cgtcgtgctc 60 
ctgggcacgg cgcccaccta cgtgttggcc tggggggtct ggcggctgct ctccgccttc 120 
ctgcccgccc gcttctacca agcgctggac gaccggctgt actgcgtcta ccagagcatg 180 
gtgctcttct tcttcgagaa ttacaccggg gtccag 216 

<210> 101 
<211> 70 
<212> DNA 

<213> Homo sapiens 
<400> 101 

atattgctat atggagattt gccaaaaaat aaagaaaata taatatattt agcaaatcat 60
caaagcacag 70
<210> 102
<211> 116
<212> DNA
<213> Homo sapiens
<400> 102

ttgactggat tgttgctgac atcttggcca tcaggcagaa tgcgctagga catgtgcgct 60
acgtgctgaa ag^agggtta aaatggctgc cattgtatgg gtgttacttt gctcag 116
<210> 103
<211> 90
<212> DNA
<213> Homo sapiens
<400> 103

catggaggaa tctatgtaaa gcgcagtgcc aaatttaacg agaaagagat gcgaaacaag 60
ttgcagagct acgtggacgc aggaactcca 90
<210> 104
<211> 91
<212> DNA
<213> Homo sapiens
<400> 104

atgtatcttg tgatttttcc agaaggtaca aggtataatc cagagcaaac aaaagtcctt 60
tcagctagtc aggcatttgc tgcccaacgt g 91
<210> 105 
<211> 159 
<212> DNA 
<213> Homo sapiens 
<400> 105 

gccttgcagt attaaaacat gtgctaacac cacgaataaa ggcaactcac gttgcttttg 60 

attgcatgaa gaattattta gatgcaattt atgatgttac ggtggtttat gaagggaaag 120 

acgatggagg gc^gcgaaga gagtcaccga ccatgacgg 159 

<210> 106

<211> 124 

<212> DNA 

<213> Homo sapiens 

<400> 106

aatttctctg caaagaatgt ccaaaaattc atattcacat tgatcgtatc gacaaaaaag 60 

atgtcccaga agaacaagaa catatgagaa gatggctgca tgaacgtttc gaaatcaaag 120 

ataa 124

<210> 107 

<211> 4342 

<212> DNA 

<213> Homo sapiens 

<220> 

<221> polyA_signal 

<222> 325. .330 

<223> AATAAA potential 

<221> polyA_signal

<222> 694. .699 

<223> AATAAA potential 

<221> polyA_signal 

<222> 828..833

<223> AATAAA potential 

<221> polyA_signal 

<222> 1821..1826

<223> AATAAA potential
<221> polyA_signal
<222> 2480..2485
<223> AATAAA potential
<221> polyA_signal
<222> 2800..2805
<223> AATAAA potential
<221> polyA_signal
<222> 4264..4269
<223> AATAAA potential
<221> polyA_signal
<222> 4320. .41315 
<223> AATAAA : 
<400> 107 

gatgcttata gaattttatg agtcaccaga tccagaaaga agaaaaagat ttcctgggaa 60 

aagtgttaat tccaaattaa gtatcaagaa gactttacca tcaatgttga tcttaagtgg 120 

tttgactgca ggcatgctta tgaccgatgc tggaaggaag ctgtatgfe^a acacctggat 180

atatggaacc ctacttggct gcctgtgggt tactattaaa gcatagacaa gtagctgtct 240

ccagacagtg ggatgtgcta cattgtctat ttttggcggc tgcacatgac atcaaattgt 300 

ttcctgaatt tattaaggag tgtaaataaa gccttgttga ttgaagattg gataatagaa 360 

tttgtgacga aagctgatat gcaatggtct tgggcaaaca tacctggttg tacaacttta 420 

gcatcggggc tgctggaagg gtaaaagcta aatggagttt ctcctgctct gtccatttcc 480 

tatgaactaa tgacaacttg agaaggctgg gaggattgtg tattttgcaa gtcagatggc 540 

tgcatttttg agcattaatt tgcagcgtat ttcacttttt ctgttatttt caatttatta 600 

caacttgaca gctccaagct cttattacta aagtatttag tatcttgcag ctagttaata 660 

tttcatcttt tgcttatttc tacaagtcag tgaaataaat tgtatttagg aagtgtcagg 720 

atgttcaaag gaaagggtaa aaagtgttca tggggaaaaa gctctgttta gcacatgatt 780 
ttattgtatt gcgttattag ctgattttac tcattttata tttgcaaaat aaatttctaa 840 
tatttattga aattgcttaa tttgcacacc ctgtacacac agaaaatggt ataaaatatg 900 
agaacgaagt ttaaaattgt gactctgatt cattatagca gaactttaaa tttcccagct 960 

ttttgaagat ttaagctacg ctattagtac ttccctttgt ctgtgccata agtgcttgaa 1020 

aacgttaagg ttttctgttt tgttttgttt ttttaatatc aaaagagtcg gtgtgaacct 1080 

tggttggacc cc^agttcac aagattttta aggtgatgag agcctgcaga cattctgcct 1140 

agatttacta gcgtgtgcct tttgcctgct tctctttgat ttcacagaat attcattcag 1200 

aagtcgcgtt tcbgtagtgt ggtggattcc cactgggctc tggtccttcc cttggatccc 1260 

gtcagtggtg ctgctcagcg gcttgcacgt agacttgcta ggaagaaatg cagagccagc 1320 

ctgtgctgcc cactttcaga gttgaactct ttaagccctt gtgagtgggc ttcaccagct 1380 

actgcagagg cattttgcat ttgtctgtgt caagaagttc accttctcaa gccagtgaaa 1440 

tacagactta attcgtcatg actgaacgaa tttgtttatt tcccattagg tttagtggag 1500 

ctacacatta atatgtatcg ccttagagca agagctgtgt tccaggaacc agatcacgat 1560 

ttttagccat ggaacaatat atcccatggg agaagacctt tcagtgtgaa ctgttctatt 1620 

tttgtgttat aatttaaact tcgatttcct catagtcctt taagttgaca tttctgctta 1680 

ctgctactgg atttttgctg cagaaatata tcagtggccc acattaaaca taccagttgg 1740 

atcatgataa gcaaaatgaa agaaataatg attaagggaa aattaagtga ctgtgttaca 1800 

ctgcttctcc catgccagag aataaactct ttcaagcatc atctttgaag agtcgtgtgg 1860 

tgtgaattgg tttgtgtaca ttagaatgta tgcacacatc catggacact caggatatag 1920 

ttggcctaat aatcggggca tgggtaaaac ttatgaaaat ttcctcatgc tgaattgtaa 1980 

ttttctctta cctgtaaagt aaaatttaga tcaattccat gtctttgtta agtacaggga 2040 

tttaatatat tttgaatata atgggtatgt tctaaatttg aactttgaga ggcaatactg 2100 

ttggaattat gtggattcta actcatttta acaaggtagc ctgacctgca taagatcact 2160 

tgaatgttag gtttcataga actatactaa tcttctcaca aaaggtctat aaaatacagt 2220 

cgttgaaaaa aattttgtat caaaatgttt ggaaaattag aagcttctcc ttaacctgta 2280 

ttgatactga cttgaattat tttctaaaat taagagccgt atacctacct gtaagtcttt 2340 

tcacatatca tttaaacttt tgtttgtatt attactgatt tacagcttag ttattaattt 2400 

ttctttataa gaatgccgtc gatgtgcatg cttttatgtt tttcagaaaa gggtgtgttt 2460 

ggatgaaagt aaaaaaaaaa ataaaatctt tcactgtctc taatggctgt gctgtttaac 2520 

attttttgac cctaaaattc accaacagtc tcccagtaca taaaataggc ttaatgactg 2580 

gccctgcatt cttcacaata tttttcccta agctttgagc aaagttttaa aaaaatacac 2640 

taaaataatc aaaactgtta agcagtatat tagtttggtt atataaattc atctgcaatt 2700 

tataagatgc atggccgatg ttaatttgct tggcaattct gtaatcatta agtgatctca 2760 

gtgaaacatg tcaaatgcct taaattaact aagttggtga ataaaagtgc cgatctggct 2820 

aactcttaca ccatacatac tgatagtttt tcatatgttt catttccatg tgatttttaa 2880 






aatttagagt ggcaacaatt ttgcttaata tgggttacat aagctttatt ttttcctttg 2940
ttcataatta tattctttga ataggtctgt gtcaatcaag tgatctaact agactgatca 3000
tagatagaag gaaataaggc caagttcaag accagcctgg gcaacatatc gagaacctgt 3060
ctacaaaaaa attaaaaaaa attagccagg catggtggcg tacactgagt agtttgtccc 3120
agctactcgg gagggtgagg tgggaggatc gcttcagccc aggaggttga gattgcagtg 3180
agccatggac ataccactgc actacagcct aggtaacagc acgagacccc aactcttaga 3240
aaatgaaaag gaaatataga aatataaaat ttgcttatta tagacacaca gtaactccca 3300
gatatgtacc acaaaaaatg tgaaaagaga gagaaatgtc taccaaagca gtattttgtg 3360
tgtataattg caagcgcata gtaaaataat tttaacctta atttgttttt agtagtgttt 3420
agattgaaga ttgagtgaaa tattttcttg gcagatattc cgtatctggt ggaaagctac 3480
aatgcaatgt cgttgtagtt ttgcatggct tgctttataa acaagatttt ttctccctcc 3540
ttttgggcca gttttcatta cgagtaactc acactttttg attaaagaac ttgaaattac 3600
gttatcactt agtataattg acattatata gagactatgt aacatgcaat cattagaatc 3660
aaaattagta ctttggtcaa aatatttaca acattcacat acttgtcaaa tattcatgta 3720
attaactgaa tttaaaacct tcaactatta tgaagtgctc gtctgtacaa tcgctaattt 3780
actcagttta gagtagctac aactcttcga tactatcatc aatatttgac atcttttcca 3840
atttgtgtat gaaaagtaaa tctattcctg tagcaactgg ggagtcatat atgaggtcaa 3900
agacatatac cttgttatta taatatgtat actataataa tagctggtta tcctgagcag 3960
gggaaaaggt tatttttagg aaaaccactt caaatagaaa gctgaagtac ttctaatata 4020
ctgagggaag tataatatgt ggaacaaact ctcaacaaaa tgtttattga tgttgatgaa 4080
acagatcagt ttttccatcc ggattattat tggttcatga ttttatatgt gaatatgtaa 4140
gatatgttct gcaattttat aaatgttcat gtcttttttt aaaaaaggtg ctattgaaat 4200
tctgtgtctc cagcaggcaa gaatacttga ctaactcttt ttgtctcttt atggtatttt 4260
cagaataaag tctgacttgt gtttttgaga ttattggtgc ctcattaatt cagcaataaa 4320
ggaaaatatg catctcaaaa at 4342

<210> 108
<211> 62
<212> DNA
<213> Homo sapiens
<400> 108

agattggatt cgtagattaa acttgagaaa caaaccataa aagtggaagg ccctctttaa 60
ca 62

<210> 109
<211> 86
<212> DNA
<213> Homo sapiens
<400> 109

gagatggggg tctcgctgtg ttgcccaggc tggtcttgga ctcaagcaat ctgcctgtct 60
cagcctacca aaatgctgga ttatag 86

<210> 110
<211> 116
<212> DNA
<213> Homo sapiens
<400> 110

gctaaagcag tcctcctgag tagttaggac tacagacata cacgtgccac cgcgcccagc 60
tccgtgttct ctttgtttcc ctgcctcctg ctcttccact tatctttgca tggcag 116

<210> 111
<211> 45
<212> DNA
<213> Homo sapiens
<400> 111

ggaaagacga tggagggcag cgaagagagt caccgaccat gacgg 45

<210> 112
<211> 5138
<212> DNA
<213> Homo sapiens
<220>
<221> misc_feature
<222> 31..33
<223> ATG

<221> misc_feature
<222> 262..264
<223> TAG

<221> polyA_signal
<222> 5111..5116
<223> AATAAA
<400> 112

ctgctgtccc tggtgctcca cacgtactcc atg cgc tac ctg ctg ccc agc gtc 54
Met Arg Tyr Leu Leu Pro Ser Val
1 5

gtg ctc ctg ggc acg gcg ccc acc tac gtg ttg gcc tgg ggg gtc tgg 102
Val Leu Leu Gly Thr Ala Pro Thr Tyr Val Leu Ala Trp Gly Val Trp
10 15 20

cgg ctg ctc tcc gcc ttc ctg ccc gcc cgc ttc tac caa gcg ctg gac 150
Arg Leu Leu Ser Ala Phe Leu Pro Ala Arg Phe Tyr Gln Ala Leu Asp
25 30 35 40

gac cgg ctg tac tgc gtc tac cag agc atg gtg ctc ttc ttc ttc gag 198
Asp Arg Leu Tyr Cys Val Tyr Gln Ser Met Val Leu Phe Phe Phe Glu
45 50 55

aat tac acc ggg gtc cag ttg act gga ttg ttg ctg aca tct tgg cca 246
Asn Tyr Thr Gly Val Gln Leu Thr Gly Leu Leu Leu Thr Ser Trp Pro
60 65 70

tca ggc aga atg cgc tag gacatgtgcg ctacgtgctg aaagaagggt 294
Ser Gly Arg Met Arg *
75

taaaatggct gccattgtat gggtgttact ttgctcagca tggaggaatc tatgtaaagc 354
gcagtgccaa atttaacgag aaagagatgc gaaacaagtt gcagagctac gtggacgcag 414
gaactccaat gtatcttgtg atttttccag aaggtacaag gtataatcca gagcaaacaa 474
aagtcctttc agctagtcag gcatttgctg cccaacgtgg ccttgcagta ttaaaacatg 534
tgctaacacc acgaataaag gcaactcacg ttgcttttga ttgcatgaag aattatttag 594
atgcaattta tgatgttacg gtggtttatg aagggaaaga cgatggaggg cagcgaagag 654
agtcaccgac catgacggaa tttctctgca aagaatgtcc aaaaattcat attcacattg 714
atcgtatcga caaaaaagat gtcccagaag aacaagaaca tatgagaaga tggctgcatg 774
aacgtttcga aatcaaagat aagatgctta tagaatttta tgagtcacca gatccagaaa 834
gaagaaaaag atttcctggg aaaagtgtta attccaaatt aagtatcaag aagactttac 894
catcaatgtt gatcttaagt ggtttgactg caggcatgct tatgaccgat gctggaagga 954
agctgtatgt gaacacctgg atatatggaa ccctacttgg ctgcctgtgg gttactatta 1014
aagcatagac aagtagctgt ctccagacag tgggatgtgc tacattgtct atttttggcg 1074
gctgcacatg acatcaaatt gtttcctgaa tttattaagg agtgtaaata aagccttgtt 1134
gattgaagat tggataatag aatttgtgac gaaagctgat atgcaatggt cttgggcaaa 1194
catacctggt tgtacaactt tagcatcggg gctgctggaa gggtaaaagc taaatggagt 1254
ttctcctgct ctgtccattt cctatgaact aatgacaact tgagaaggct gggaggattg 1314
tgtattttgc aagtcagatg gctgcatttt tgagcattaa tttgcagcgt atttcacttt 1374
ttctgttatt ttcaatttat tacaacttga cagctccaag ctcttattac taaagtattt 1434
agtatcttgc agctagttaa tatttcatct tttgcttatt tctacaagtc agtgaaataa 1494
attgtattta ggaagtgtca ggatgttcaa aggaaagggt aaaaagtgtt catggggaaa 1554
aagctctgtt tagcacatga ttttattgta ttgcgttatt agctgatttt actcatttta 1614
tatttgcaaa ataaatttct aatatttatt gaaattgctt aatttgcaca ccctgtacac 1674
acagaaaatg gtataaaata tgagaacgaa gtttaaaatt gtgactctga ttcattatag 1734
cagaacttta aatttcccag ctttttgaag atttaagcta cgctattagt acttcccttt 1794
gtctgtgcca taagtgcttg aaaacgttaa ggttttctgt tttgttttgt ttttttaata 1854
tcaaaagagt cggtgtgaac cttggttgga ccccaagttc acaagatttt taaggtgatg 1914
agagcctgca gacattctgc ctagatttac tagcgtgtgc cttttgcctg cttctctttg 1974
atttcacaga atattcattc agaagtcgcg tttctgtagt gtggtggatt cccactgggc 2034
tctggtcctt cccttggatc ccgtcagtgg tgctgctcag cggcttgcac gtagacttgc 2094
taggaagaaa tgcagagcca gcctgtgctg cccactttca gagttgaact ctttaagccc 2154
ttgtgagtgg gcttcaccag ctactgcaga ggcattttgc atttgtctgt gtcaagaagt 2214
tcaccttctc aagccagtga aatacagact taattcgtca tgactgaacg aatttgttta 2274
tttcccatta ggtttagtgg agctacacat taatatgtat cgccttagag caagagctgt 2334
gttccaggaa ccagatcacg atttttagcc atggaacaat atatcccatg ggagaagacc 2394
tttcagtgtg aactgttcta tttttgtgtt ataatttaaa cttcgatttc ctcatagtcc 2454
tttaagttga catttctgct tactgctact ggatttttgc tgcagaaata tatcagtggc 2514
ccacattaaa cataccagtt ggatcatgat aagcaaaatg aaagaaataa tgattaaggg 2574
aaaattaagt gactgtgtta cactgcttct cccatgccag agaataaact ctttcaagca 2634
tcatctttga agagtcgtgt ggtgtgaatt ggtttgtgta cattagaatg tatgcacaca 2694
tccatggaca ctcaggatat agttggccta ataatcgggg catgggtaaa acttatgaaa 2754
atttcctcat gctgaattgt aattttctct tacctgtaaa gtaaaattta gatcaattcc 2814
atgtctttgt taagtacagg gatttaatat attttgaata taatgggtat gttctaaatt 2874
tgaactttga gaggcaatac tgttggaatt atgtggattc taactcattt taacaaggta 2934
gcctgacctg cataagatca cttgaatgtt aggtttcata gaactatact aatcttctca 2994
caaaaggtct ataaaataca gtcgttgaaa aaaattttgt atcaaaatgt ttggaaaatt 3054
agaagcttct ccttaacctg tattgatact gacttgaatt attttctaaa attaagagcc 3114
gtatacctac ctgtaagtct tttcacatat catttaaact tttgtttgta ttattactga 3174
tttacagctt agttattaat ttttctttat aagaatgccg tcgatgtgca tgcttttatg 3234
tttttcagaa aagggtgtgt ttggatgaaa gtaaaaaaaa aaataaaatc tttcactgtc 3294
tctaatggct gtgctgttta acattttttg accctaaaat tcaccaacag tctcccagta 3354
cataaaatag gcttaatgac tggccctgca ttcttcacaa tatttttccc taagctttga 3414
gcaaagtttt aaaaaaatac actaaaataa tcaaaactgt taagcagtat attagtttgg 3474
ttatataaat tcatctgcaa tttataagat gcatggccga tgttaatttg cttggcaatt 3534
ctgtaatcat taagtgatct cagtgaaaca tgtcaaatgc cttaaattaa ctaagttggt 3594
gaataaaagt gccgatctgg ctaactctta caccatacat actgatagtt tttcatatgt 3654
ttcatttcca tgtgattttt aaaatttaga gtggcaacaa ttttgcttaa tatgggttac 3714
ataagcttta ttttttcctt tgttcataat tatattcttt gaataggtct gtgtcaatca 3774
agtgatctaa ctagactgat catagataga aggaaataag gccaagttca agaccagcct 3834
gggcaacata tcgagaacct gtctacaaaa aaattaaaaa aaattagcca ggcatggtgg 3894
cgtacactga gtagtttgtc ccagctactc gggagggtga ggtgggagga tcgcttcagc 3954
ccaggaggtt gagattgcag tgagccatgg acataccact gcactacagc ctaggtaaca 4014
gcacgagacc ccaactctta gaaaatgaaa aggaaatata gaaatataaa atttgcttat 4074
tatagacaca cagtaactcc cagatatgta ccacaaaaaa tgtgaaaaga gagagaaatg 4134
tctaccaaag cagtattttg tgtgtataat tgcaagcgca tagtaaaata attttaacct 4194
taatttgttt ttagtagtgt ttagattgaa gattgagtga aatattttct tggcagatat 4254
tccgtatctg gtggaaagct acaatgcaat gtcgttgtag ttttgcatgg cttgctttat 4314
aaacaagatt ttttctccct ccttttgggc cagttttcat tacgagtaac tcacactttt 4374
tgattaaaga acttgaaatt acgttatcac ttagtataat tgacattata tagagactat 4434
gtaacatgca atcattagaa tcaaaattag tactttggtc aaaatattta caacattcac 4494
atacttgtca aatattcatg taattaactg aatttaaaac cttcaactat tatgaagtgc 4554
tcgtctgtac aatcgctaat ttactcagtt tagagtagct acaactcttc gatactatca 4614
tcaatatttg acatcttttc caatttgtgt atgaaaagta aatctattcc tgtagcaact 4674
ggggagtcat atatgaggtc aaagacatat accttgttat tataatatgt atactataat 4734
aatagctggt tatcctgagc aggggaaaag gttattttta ggaaaaccac ttcaaataga 4794
aagctgaagt acttctaata tactgaggga agtataatat gtggaacaaa ctctcaacaa 4854
aatgtttatt gatgttgatg aaacagatca gtttttccat ccggattatt attggttcat 4914
gattttatat gtgaatatgt aagatatgtt ctgcaatttt ataaatgttc atgtcttttt 4974
ttaaaaaagg tgctattgaa attctgtgtc tccagcaggc aagaatactt gactaactct 5034
ttttgtctct ttatggtatt ttcagaataa agtctgactt gtgtttttga gattattggt 5094
gcctcattaa ttcagcaata aaggaaaata tgcatctcaa aaat 5138

<210> 113
<211> 5224
<212> DNA
<213> Homo sapiens
<220>
<221> misc_feature
<222> 31..33
<223> ATG

<221> misc_feature
<222> 262..264
<223> TAG

<221> polyA_signal
<222> 5197..5202
<223> AATAAA
<400> 113

ctgctgtccc tggtgctcca cacgtactcc atg cgc tac ctg ctg ccc agc gtc 54
Met Arg Tyr Leu Leu Pro Ser Val
1 5

gtg ctc ctg ggc acg gcg ccc acc tac gtg ttg gcc tgg ggg gtc tgg 102
Val Leu Leu Gly Thr Ala Pro Thr Tyr Val Leu Ala Trp Gly Val Trp
10 15 20

cgg ctg ctc tcc gcc ttc ctg ccc gcc cgc ttc tac caa gcg ctg gac 150
Arg Leu Leu Ser Ala Phe Leu Pro Ala Arg Phe Tyr Gln Ala Leu Asp
25 30 35 40

gac cgg ctg tac tgc gtc tac cag agc atg gtg ctc ttc ttc ttc gag 198
Asp Arg Leu Tyr Cys Val Tyr Gln Ser Met Val Leu Phe Phe Phe Glu
45 50 55

aat tac acc ggg gtc cag ttg act gga ttg ttg ctg aca tct tgg cca 246
Asn Tyr Thr Gly Val Gln Leu Thr Gly Leu Leu Leu Thr Ser Trp Pro
60 65 70

tca ggc aga atg cgc tag gacatgtgcg ctacgtgctg aaagaagggt 294
Ser Gly Arg Met Arg *
75

taaaatggct gccattgtat gggtgttact ttgctcagga gatgggggtc tcgctgtgtt 354
gcccaggctg gtcttggact caagcaatct gcctgtctca gcctaccaaa atgctggatt 414
atagcatgga ggaatctatg taaagcgcag tgccaaattt aacgagaaag agatgcgaaa 474
caagttgcag agctacgtgg acgcaggaac tccaatgtat cttgtgattt ttccagaagg 534
tacaaggtat aatccagagc aaacaaaagt cctttcagct agtcaggcat ttgctgccca 594
acgtggcctt gcagtattaa aacatgtgct aacaccacga ataaaggcaa ctcacgttgc 654
ttttgattgc atgaagaatt atttagatgc aatttatgat gttacggtgg tttatgaagg 714
gaaagacgat ggagggcagc gaagagagtc accgaccatg acggaatttc tctgcaaaga 774
atgtccaaaa attcatattc acattgatcg tatcgacaaa aaagatgtcc cagaagaaca 834
agaacatatg agaagatggc tgcatgaacg tttcgaaatc aaagataaga tgcttataga 894
attttatgag tcaccagatc cagaaagaag aaaaagattt cctgggaaaa gtgttaattc 954
caaattaagt atcaagaaga ctttaccatc aatgttgatc ttaagtggtt tgactgcagg 1014
catgcttatg accgatgctg gaaggaagct gtatgtgaac acctggatat atggaaccct 1074
acttggctgc ctgtgggtta ctattaaagc atagacaagt agctgtctcc agacagtggg 1134
atgtgctaca ttgtctattt ttggcggctg cacatgacat caaattgttt cctgaattta 1194
ttaaggagtg taaataaagc cttgttgatt gaagattgga taatagaatt tgtgacgaaa 1254
gctgatatgc aatggtcttg ggcaaacata cctggttgta caactttagc atcggggctg 1314
ctggaagggt aaaagctaaa tggagtttct cctgctctgt ccatttccta tgaactaatg 1374
acaacttgag aaggctggga ggattgtgta ttttgcaagt cagatggctg catttttgag 1434
cattaatttg cagcgtattt cactttttct gttattttca atttattaca acttgacagc 1494
tccaagctct tattactaaa gtatttagta tcttgcagct agttaatatt tcatcttttg 1554
cttatttcta caagtcagtg aaataaattg tatttaggaa gtgtcaggat gttcaaagga 1614
aagggtaaaa agtgttcatg gggaaaaagc tctgtttagc acatgatttt attgtattgc 1674
gttattagct gattttactc attttatatt tgcaaaataa atttctaata tttattgaaa 1734
ttgcttaatt tgcacaccct gtacacacag aaaatggtat aaaatatgag aacgaagttt 1794
aaaattgtga ctctgattca ttatagcaga actttaaatt tcccagcttt ttgaagattt 1854
aagctacgct attagtactt ccctttgtct gtgccataag tgcttgaaaa cgttaaggtt 1914
ttctgttttg ttttgttttt ttaatatcaa aagagtcggt gtgaaccttg gttggacccc 1974
aagttcacaa gatttttaag gtgatgagag cctgcagaca ttctgcctag atttactagc 2034
gtgtgccttt tgcctgcttc tctttgattt cacagaatat tcattcagaa gtcgcgtttc 2094
tgtagtgtgg tggattccca ctgggctctg gtccttccct tggatcccgt cagtggtgct 2154
gctcagcggc ttgcacgtag acttgctagg aagaaatgca gagccagcct gtgctgccca 2214
ctttcagagt tgaactcttt aagcccttgt gagtgggctt caccagctac tgcagaggca 2274
ttttgcattt gtctgtgtca agaagttcac cttctcaagc cagtgaaata cagacttaat 2334
tcgtcatgac tgaacgaatt tgtttatttc ccattaggtt tagtggagct acacattaat 2394
atgtatcgcc ttagagcaag agctgtgttc caggaaccag atcacgattt ttagccatgg 2454
aacaatatat cccatgggag aagacctttc agtgtgaact gttctatttt tgtgttataa 2514
tttaaacttc gatttcctca tagtccttta agttgacatt tctgcttact gctactggat 2574
ttttgctgca gaaatatatc agtggcccac attaaacata ccagttggat catgataagc 2634
aaaatgaaag aaataatgat taagggaaaa ttaagtgact gtgttacact gcttctccca 2694
tgccagagaa taaactcttt caagcatcat ctttgaagag tcgtgtggtg tgaattggtt 2754
tgtgtacatt agaatgtatg cacacatcca tggacactca ggatatagtt ggcctaataa 2814
tcggggcatg ggtaaaactt atgaaaattt cctcatgctg aattgtaatt ttctcttacc 2874
tgtaaagtaa aatttagatc aattccatgt ctttgttaag tacagggatt taatatattt 2934
tgaatataat gggtatgttc taaatttgaa ctttgagagg caatactgtt ggaattatgt 2994
ggattctaac tcattttaac aaggtagcct gacctgcata agatcacttg aatgttaggt 3054
ttcatagaac tatactaatc ttctcacaaa aggtctataa aatacagtcg ttgaaaaaaa 3114
ttttgtatca aaatgtttgg aaaattagaa gcttctcctt aacctgtatt gatactgact 3174
tgaattattt tctaaaatta agagccgtat acctacctgt aagtcttttc acatatcatt 3234
taaacttttg tttgtattat tactgattta cagcttagtt attaattttt ctttataaga 3294
atgccgtcga tgtgcatgct tttatgtttt tcagaaaagg gtgtgtttgg atgaaagtaa 3354
aaaaaaaaat aaaatctttc actgtctcta atggctgtgc tgtttaacat tttttgaccc 3414
taaaattcac caacagtctc ccagtacata aaataggctt aatgactggc cctgcattct 3474
tcacaatatt tttccctaag ctttgagcaa agttttaaaa aaatacacta aaataatcaa 3534
aactgttaag cagtatatta gtttggttat ataaattcat ctgcaattta taagatgcat 3594
ggccgatgtt aatttgcttg gcaattctgt aatcattaag tgatctcagt gaaacatgtc 3654
aaatgcctta aattaactaa gttggtgaat aaaagtgccg atctggctaa ctcttacacc 3714
atacatactg atagtttttc atatgtttca tttccatgtg atttttaaaa tttagagtgg 3774
caacaatttt gcttaatatg ggttacataa gctttatttt ttcctttgtt cataattata 3834
ttctttgaat aggtctgtgt caatcaagtg atctaactag actgatcata gatagaagga 3894
aataaggcca agttcaagac cagcctgggc aacatatcga gaacctgtct acaaaaaaat 3954
taaaaaaaat tagccaggca tggtggcgta cactgagtag tttgtcccag ctactcggga 4014
gggtgaggtg ggaggatcgc ttcagcccag gaggttgaga ttgcagtgag ccatggacat 4074
accactgcac tacagcctag gtaacagcac gagaccccaa ctcttagaaa atgaaaagga 4134
aatatagaaa tataaaattt gcttattata gacacacagt aactcccaga tatgtaccac 4194
aaaaaatgtg aaaagagaga gaaatgtcta ccaaagcagt attttgtgtg tataattgca 4254
agcgcatagt aaaataattt taaccttaat ttgtttttag tagtgtttag attgaagatt 4314
gagtgaaata ttttcttggc agatattccg tatctggtgg aaagctacaa tgcaatgtcg 4374
ttgtagtttt gcatggcttg ctttataaac aagatttttt ctccctcctt ttgggccagt 4434
tttcattacg agtaactcac actttttgat taaagaactt gaaattacgt tatcacttag 4494
tataattgac attatataga gactatgtaa catgcaatca ttagaatcaa aattagtact 4554
ttggtcaaaa tatttacaac attcacatac ttgtcaaata ttcatgtaat taactgaatt 4614
taaaaccttc aactattatg aagtgctcgt ctgtacaatc gctaatttac tcagtttaga 4674
gtagctacaa ctcttcgata ctatcatcaa tatttgacat cttttccaat ttgtgtatga 4734
aaagtaaatc tattcctgta gcaactgggg agtcatatat gaggtcaaag acatatacct 4794
tgttattata atatgtatac tataataata gctggttatc ctgagcaggg gaaaaggtta 4854
tttttaggaa aaccacttca aatagaaagc tgaagtactt ctaatatact gagggaagta 4914
taatatgtgg aacaaactct caacaaaatg tttattgatg ttgatgaaac agatcagttt 4974
ttccatccgg attattattg gttcatgatt ttatatgtga atatgtaaga tatgttctgc 5034
aattttataa atgttcatgt ctttttttaa aaaaggtgct attgaaattc tgtgtctcca 5094
gcaggcaaga atacttgact aactcttttt gtctctttat ggtattttca gaataaagtc 5154
tgacttgtgt ttttgagatt attggtgcct cattaattca gcaataaagg aaaatatgca 5214
tctcaaaaat 5224

<210> 114
<211> 4863
<212> DNA
<213> Homo sapiens
<220>
<221> misc_feature
<222> 31..33
<223> ATG

<221> misc_feature
<222> 745..747
<223> TAG

<221> polyA_signal
<222> 4836..4841
<223> AATAAA
<400> 114

ctgctgtccc tggtgctcca cacgtactcc atg cgc tac ctg ctg ccc agc gtc 54
Met Arg Tyr Leu Leu Pro Ser Val
1 5

gtg ctc ctg ggc acg gcg ccc acc tac gtg ttg gcc tgg ggg gtc tgg 102
Val Leu Leu Gly Thr Ala Pro Thr Tyr Val Leu Ala Trp Gly Val Trp
10 15 20

cgg ctg ctc tcc gcc ttc ctg ccc gcc cgc ttc tac caa gcg ctg gac 150
Arg Leu Leu Ser Ala Phe Leu Pro Ala Arg Phe Tyr Gln Ala Leu Asp
25 30 35 40

gac cgg ctg tac tgc gtc tac cag agc atg gtg ctc ttc ttc ttc gag 198
Asp Arg Leu Tyr Cys Val Tyr Gln Ser Met Val Leu Phe Phe Phe Glu
45 50 55

aat tac acc ggg gtc cag cat gga gga atc tat gta aag cgc agt gcc 246
Asn Tyr Thr Gly Val Gln His Gly Gly Ile Tyr Val Lys Arg Ser Ala
60 65 70

aaa ttt aac gag aaa gag atg cga aac aag ttg cag agc tac gtg gac 294
Lys Phe Asn Glu Lys Glu Met Arg Asn Lys Leu Gln Ser Tyr Val Asp
75 80 85

gca gga act cca atg tat ctt gtg att ttt cca gaa ggt aca agg tat 342
Ala Gly Thr Pro Met Tyr Leu Val Ile Phe Pro Glu Gly Thr Arg Tyr
90 95 100

aat cca gag caa aca aaa gtc ctt tca gct agt cag gca ttt gct gcc 390
Asn Pro Glu Gln Thr Lys Val Leu Ser Ala Ser Gln Ala Phe Ala Ala
105 110 115 120

caa cgt gaa ttt ctc tgc aaa gaa tgt cca aaa att cat att cac att 438
Gln Arg Glu Phe Leu Cys Lys Glu Cys Pro Lys Ile His Ile His Ile
125 130 135

gat cgt atc gac aaa aaa gat gtc cca gaa gaa caa gaa cat atg aga 486
Asp Arg Ile Asp Lys Lys Asp Val Pro Glu Glu Gln Glu His Met Arg
140 145 150

aga tgg ctg cat gaa cgt ttc gaa atc aaa gat aag atg ctt ata gaa 534
Arg Trp Leu His Glu Arg Phe Glu Ile Lys Asp Lys Met Leu Ile Glu
155 160 165

ttt tat gag tca cca gat cca gaa aga aga aaa aga ttt cct ggg aaa 582
Phe Tyr Glu Ser Pro Asp Pro Glu Arg Arg Lys Arg Phe Pro Gly Lys
170 175 180

agt gtt aat tcc aaa tta agt atc aag aag act tta cca tca atg ttg 630
Ser Val Asn Ser Lys Leu Ser Ile Lys Lys Thr Leu Pro Ser Met Leu
185 190 195 200

atc tta agt ggt ttg act gca ggc atg ctt atg acc gat gct gga agg 678
Ile Leu Ser Gly Leu Thr Ala Gly Met Leu Met Thr Asp Ala Gly Arg
205 210 215

aag ctg tat gtg aac acc tgg ata tat gga acc cta ctt ggc tgc ctg 726
Lys Leu Tyr Val Asn Thr Trp Ile Tyr Gly Thr Leu Leu Gly Cys Leu
220 225 230

tgg gtt act att aaa gca tag acaagtagct gtctccagac agtgggatgt 777
Trp Val Thr Ile Lys Ala *
235

gctacattgt ctatttttgg cggctgcaca tgacatcaaa ttgtttcctg aatttattaa 837
ggagtgtaaa taaagccttg ttgattgaag attggataat agaatttgtg acgaaagctg 897
atatgcaatg gtcttgggca aacatacctg gttgtacaac tttagcatcg gggctgctgg 957
aagggtaaaa gctaaatgga gtttctcctg ctctgtccat ttcctatgaa ctaatgacaa 1017
cttgagaagg ctgggaggat tgtgtatttt gcaagtcaga tggctgcatt tttgagcatt 1077
aatttgcagc gtatttcact ttttctgtta ttttcaattt attacaactt gacagctcca 1137
agctcttatt actaaagtat ttagtatctt gcagctagtt aatatttcat cttttgctta 1197
tttctacaag tcagtgaaat aaattgtatt taggaagtgt caggatgttc aaaggaaagg 1257
gtaaaaagtg ttcatgggga aaaagctctg tttagcacat gattttattg tattgcgtta 1317
ttagctgatt ttactcattt tatatttgca aaataaattt ctaatattta ttgaaattgc 1377
ttaatttgca caccctgtac acacagaaaa tggtataaaa tatgagaacg aagtttaaaa 1437
ttgtgactct gattcattat agcagaactt taaatttccc agctttttga agatttaagc 1497
tacgctatta gtacttccct ttgtctgtgc cataagtgct tgaaaacgtt aaggttttct 1557
gttttgtttt gtttttttaa tatcaaaaga gtcggtgtga accttggttg gaccccaagt 1617
tcacaagatt tttaaggtga tgagagcctg cagacattct gcctagattt actagcgtgt 1677
gccttttgcc tgcttctctt tgatttcaca gaatattcat tcagaagtcg cgtttctgta 1737
gtgtggtgga ttcccactgg gctctggtcc ttcccttgga tcccgtcagt ggtgctgctc 1797
agcggcttgc acgtagactt gctaggaaga aatgcagagc cagcctgtgc tgcccacttt 1857
cagagttgaa ctctttaagc ccttgtgagt gggcttcacc agctactgca gaggcatttt 1917
gcatttgtct gtgtcaagaa gttcaccttc tcaagccagt gaaatacaga cttaattcgt 1977
catgactgaa cgaatttgtt tatttcccat taggtttagt ggagctacac attaatatgt 2037
atcgccttag agcaagagct gtgttccagg aaccagatca cgatttttag ccatggaaca 2097 

atatatccca tgggagaaga cctttcagtg tgaactgttc tatttttgtg ttataattta 2157 

aacttcgatt tcctcatagt cctttaagtt gacatttctg cttactgcta ctggattttt 2217 

gctgcagaaa tatatcagtg gcccacatta aacataccag ttggatcatg ataagcaaaa 2277 

tgaaagaaat aatgattaag ggaaaattaa gtgactgtgt tacactgctt ctcccatgcc 2337 

agagaataaa ctctttcaag catcatcttt gaagagtcgt gtggtgtgaa ttggtttgtg 2397 

tacattagaa tgtatgcaca catccatgga cactcaggat atagttggcc taataatcgg 2457 

ggcatgggta aaacttatga aaatttcctc atgctgaatt gtaattttct cttacctgta 2517 

aagtaaaatt tagatcaatt ccatgtcttt gttaagtaca gggatttaat atattttgaa 2577 

tataatgggt atgttctaaa tttgaacttt gagaggcaat actgttggaa ttatgtggat 2637 

tctaactcat tttaacaagg tagcctgacc tgcataagat cacttgaatg ttaggtttca 2697 

tagaactata ctaatcttct cacaaaaggt ctataaaata cagtcgttga aaaaaatttt 2757 

gtatcaaaat gtttggaaaa ttagaagctt ctccttaacc tgtattgata ctgacttgaa 2817 

ttattttcta aaattaagag ccgtatacct acctgtaagt cttttcacat atcatttaaa 2877 

cttttgtttg tattattact gatttacagc ttagttatta atttttcttt ataagaatgc 2937 

cgtcgatgtg catgctttta tgtttttcag aaaagggtgt gtttggatga aagtaaaaaa 2997 

aaaaataaaa tctttcactg tctctaatgg ctgtgctgtt taacattttt tgaccctaaa 3057 

attcaccaac agtctcccag tacataaaat aggcttaatg actggccctg cattcttcac 3117 

aatatttttc cctaagcttt gagcaaagtt ttaaaaaaat acactaaaat aatcaaaact 3177 

gttaagcagt atattagttt ggttatataa attcatctgc aatttataag atgcatggcc 3237 

gatgttaatt tgcttggcaa ttctgtaatc attaagtgat ctcagtgaaa catgtcaaat 3297 

gccttaaatt aactaagttg gtgaataaaa gtgccgatct ggctaactct tacaccatac 3357 

atactgatag tttttcatat gtttcatttc catgtgattt ttaaaattta gagtggcaac 3417 

aattttgctt aatatgggtt acataagctt tattttttcc tttgttcata attatattct 3477 

ttgaataggt ctgtgtcaat caagtgatct aactagactg atcatagata gaaggaaata 3537 

aggccaagtt caagaccagc ctgggcaaca tatcgagaac ctgtctacaa aaaaattaaa 3597 

aaaaattagc caggcatggt ggcgtacact gagtagtttg tcccagctac tcgggagggt 3657 

gaggtgggag gatcgcttca gcccaggagg ttgagattgc agtgagccat ggacatacca 3717 

ctgcactaca gcctaggtaa cagcacgaga ccccaactct tagaaaatga aaaggaaata 3777 

tagaaatata aaatttgctt attatagaca cacagtaact cccagatatg taccacaaaa 3837 

aatgtgaaaa gagagagaaa tgtctaccaa agcagtattt tgtgtgtata attgcaagcg 3897 

catagtaaaa taattttaac cttaatttgt ttttagtagt gtttagattg aagattgagt 3957 

gaaatatttt cttggcagat attccgtatc tggtggaaag ctacaatgca atgtcgttgt 4017 

agttttgcat ggcttgcttt ataaacaaga ttttttctcc ctccttttgg gccagttttc 4077 

attacgagta actcacactt tttgattaaa gaacttgaaa ttacgttatc acttagtata 4137 

attgacatta tatagagact atgtaacatg caatcattag aatcaaaatt agtactttgg 4197 

tcaaaatatt tacaacattc acatacttgt caaatattca tgtaattaac tgaatttaaa 4257 

accttcaact attatgaagt gctcgtctgt acaatcgcta atttactcag tttagagtag 4317 

ctacaactct tcgatactat catcaatatt tgacatcttt tccaatttgt gtatgaaaag 4377 

taaatctatt cctgtagcaa ctggggagtc atatatgagg tcaaagacat ataccttgtt 4437 

attataatat gtatactata ataatagctg gttatcctga gcaggggaaa aggttatttt 4497 

taggaaaacc acttcaaata gaaagctgaa gtacttctaa tatactgagg gaagtataat 4557 

atgtggaaca aactctcaac aaaatgttta ttgatgttga tgaaacagat cagtttttcc 4617 

atccggatta ttattggttc atgattttat atgtgaatat gtaagatatg ttctgcaatt 4677 

ttataaatgt tcatgtcttt ttttaaaaaa ggtgctattg aaattctgtg tctccagcag 4737 

gcaagaatac ttgactaact ctttttgtct ctttatggta ttttcagaat aaagtctgac 4797 

ttgtgttttt gagattattg gtgcctcatt aattcagcaa taaaggaaaa tatgcatctc 4857 

aaaaat 4863 
<210> 115 
<211> 5022 
<212> DNA 

<213> Homo sapiens 
<220> 

<221> misc_feature
<222> 31..33
<223> ATG

<221> misc_feature
<222> 904..906
<223> TAG

<221> polyA_signal
<222> 4995..5000
<223> AATAAA
<400> 115 

ctgctgtccc tggtgctcca cacgtactcc atg cgc tac ctg ctg ccc agc gtc 54
Met Arg Tyr Leu Leu Pro Ser Val
1 5

gtg ctc ctg ggc acg gcg ccc acc tac gtg ttg gcc tgg ggg gtc tgg 102
Val Leu Leu Gly Thr Ala Pro Thr Tyr Val Leu Ala Trp Gly Val Trp
10 15 20

cgg ctg ctc tcc gcc ttc ctg ccc gcc cgc ttc tac caa gcg ctg gac 150
Arg Leu Leu Ser Ala Phe Leu Pro Ala Arg Phe Tyr Gln Ala Leu Asp
25 30 35 40

gac cgg ctg tac tgc gtc tac cag agc atg gtg ctc ttc ttc ttc gag 198
Asp Arg Leu Tyr Cys Val Tyr Gln Ser Met Val Leu Phe Phe Phe Glu
45 50 55

aat tac acc ggg gtc cag cat gga gga atc tat gta aag cgc agt gcc 246
Asn Tyr Thr Gly Val Gln His Gly Gly Ile Tyr Val Lys Arg Ser Ala
60 65 70

aaa ttt aac gag aaa gag atg cga aac aag ttg cag agc tac gtg gac 294
Lys Phe Asn Glu Lys Glu Met Arg Asn Lys Leu Gln Ser Tyr Val Asp
75 80 85

gca gga act cca atg tat ctt gtg att ttt cca gaa ggt aca agg tat 342
Ala Gly Thr Pro Met Tyr Leu Val Ile Phe Pro Glu Gly Thr Arg Tyr
90 95 100

aat cca gag caa aca aaa gtc ctt tca gct agt cag gca ttt gct gcc 390
Asn Pro Glu Gln Thr Lys Val Leu Ser Ala Ser Gln Ala Phe Ala Ala
105 110 115 120

caa cgt ggc ctt gca gta tta aaa cat gtg cta aca cca cga ata aag 438
Gln Arg Gly Leu Ala Val Leu Lys His Val Leu Thr Pro Arg Ile Lys
125 130 135

gca act cac gtt gct ttt gat tgc atg aag aat tat tta gat gca att 486
Ala Thr His Val Ala Phe Asp Cys Met Lys Asn Tyr Leu Asp Ala Ile
140 145 150

tat gat gtt acg gtg gtt tat gaa ggg aaa gac gat gga ggg cag cga 534
Tyr Asp Val Thr Val Val Tyr Glu Gly Lys Asp Asp Gly Gly Gln Arg
155 160 165

aga gag tca ccg acc atg acg gaa ttt ctc tgc aaa gaa tgt cca aaa 582
Arg Glu Ser Pro Thr Met Thr Glu Phe Leu Cys Lys Glu Cys Pro Lys
170 175 180

att cat att cac att gat cgt atc gac aaa aaa gat gtc cca gaa gaa 630
Ile His Ile His Ile Asp Arg Ile Asp Lys Lys Asp Val Pro Glu Glu
185 190 195 200

caa gaa cat atg aga aga tgg ctg cat gaa cgt ttc gaa atc aaa gat 678
Gln Glu His Met Arg Arg Trp Leu His Glu Arg Phe Glu Ile Lys Asp
205 210 215

aag atg ctt ata gaa ttt tat gag tca cca gat cca gaa aga aga aaa 726
Lys Met Leu Ile Glu Phe Tyr Glu Ser Pro Asp Pro Glu Arg Arg Lys
220 225 230

aga ttt cct ggg aaa agt gtt aat tcc aaa tta agt atc aag aag act 774
Arg Phe Pro Gly Lys Ser Val Asn Ser Lys Leu Ser Ile Lys Lys Thr
235 240 245

tta cca tca atg ttg atc tta agt ggt ttg act gca ggc atg ctt atg 822
Leu Pro Ser Met Leu Ile Leu Ser Gly Leu Thr Ala Gly Met Leu Met
250 255 260

acc gat gct gga agg aag ctg tat gtg aac acc tgg ata tat gga acc 870
Thr Asp Ala Gly Arg Lys Leu Tyr Val Asn Thr Trp Ile Tyr Gly Thr
265 270 275 280

cta ctt ggc tgc ctg tgg gtt act att aaa gca tag acaagtagct 916
Leu Leu Gly Cys Leu Trp Val Thr Ile Lys Ala *
285 290

gtctccagac agtgggatgt gctacattgt ctatttttgg cggctgcaca tgacatcaaa 976
ttgtttcctg aatttattaa ggagtgtaaa taaagccttg ttgattgaag attggataat 1036 
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agaatttgtg 
tttagcatcg 
ttcctatgaa 
tggctgcatt 
attacaactt 
aatatttcat 
caggatgttc 
gattttattg 
ctaatattta 
tatgagaacg 
agctttttga 
tgaaaacgtt 
accttggttg 
gcctagattt 
tcagaagtcg 
tcccgtcagt 
cagcctgtgc 
agctactgca 
gaaatacaga 
ggagctacac 
cgatttttag 
tatttttgtg 
cttactgcta 
ttggatcatg 
tacactgctt 
gtggtgtgaa 
atagttggcc 
gtaattttct 
gggatttaat 
actgttggaa 
cacttgaatg 
cagtcgttga 
tgtattgata 
cttttcacat 
atttttcttt 
gtttggatga 
taacattttt 
actggccctg 
acactaaaat 
aatttataag 
ctcagtgaaa 
ggctaactct 
ttaaaattta 
tttgttcata 
atcatagata 
ctgtctacaa 
tcccagctac 
agtgagccat 
tagaaaatga 
cccagatatg 
tgtgtgtata 
gtttagattg 
ctacaatgca 
ctccttttgg 
ttacgttatc 
aatcaaaatt 
tgtaattaac 
atttactcag 
tccaatttgt 
tcaaagacat 
gcaggggaaa 



acgaaagctg 
gggctgctgg 
ctkatgacaa 
ttbgagcatt 
gapagctcca 
ctjtttgctta 
aakggaaagg 
tattgcgtta 
ttgaaattgc 
aagtttaaaa 
agatttaagc 
aaggttttct 
gaccccaagt 
actagcgtgt 
cgtttctgta 
ggtgctgctc 
tgcccacttt 
gaggcatttt 
cttaattcgt 
attaatatgt 
ccatggaaca 
ttataattta 
ctggattttt 
ataagcaaaa 
ctcccatgcc 
ttggtttgtg 
taataatcgg 
ctbacctgta 
atkttttgaa 
tt^tgtggat 
tt^ggtttca 
aaaaaatttt 
ctgacttgaa 
atcatttaaa 
ataagaatgc 
aagtaaaaaa 
tgaccctaaa 
cattcttcac 
aatcaaaact 
atgcatggcc 
catgtcaaat 
tacaccatac 
gagtggcaac 
attatattct 
gaaggaaata 
aaaaattaaa 
tcgggagggt 
ggacatacca 
aapiggaaata 
tabcacaaaa 
atjtgcaagcg 
aa&attgagt 
atptcgttgt 
gcpagttttc 
acjttagtata 
agltactttgg 
tgaatttaaa 
tttagagtag 
gtatgaaaag 
ataccttgtt 
aggttatttt 



atatgcaatg 
aagggtaaaa 
cttgagaagg 
aatttgcagc 
agctcttatt 
tttctacaag 
gtaaaaagtg 
ttagctgatt 
ttaatttgca 
ttgtgactct 
tacgctatta 
gttttgtttt 
tcacaagatt 
gccttttgcc 
gtgtggtgga 
agcggcttgc 
cagagttgaa 
gcatttgtct 
catgactgaa 
atcgccttag 
atatatccca 
aacttcgatt 
gctgcagaaa 
tgaaagaaat 
agagaataaa 
tacattagaa 
ggcatgggta 
aagtaaaatt 
tataatgggt 
tctaactcat 
tagaactata 
gtatcaaaat 
ttattttcta 
cttttgtttg 
cgtcgatgtg 
aaaaataaaa 
attcaccaac 
aatatttttc 
gttaagcagt 
gatgttaatt 
gccttaaatt 
atactgatag 
aattttgctt 
ttgaataggt 
aggccaagtt 
aaaaattagc 
gaggtgggag 
ctgcactaca 
tagaaatata 
aatgtgaaaa 
catagtaaaa 
gaaatatttt 
agttttgcat 
attacgagta 
attgacatta 
tcaaaatatt 
accttcaact 
ctacaactct 
taaatctatt 
attataatat 
taggaaaacc 



gtcttgggca 
gctaaatgga 
ctgggaggat 
gtatttcact 
actaaagtat 
tcagtgaaat 
ttcatgggga 
ttactcattt 
caccctgtac 
gattcattat 
gtacttccct 
gtttttttaa 
tttaaggtga 
tgcttctctt 
ttcccactgg 
acgtagactt 
ctctttaagc 
gtgtcaagaa 
cgaatttgtt 
agcaagagct 
tgggagaaga 
tcctcatagt 
tatatcagtg 
aatgattaag 
ctctttcaag 
tgtatgcaca 
aaacttatga 
tagatcaatt 
atgttctaaa 
tttaacaagg 
ctaatcttct 
gtttggaaaa 
aaattaagag 
tattattact 



aacatacctg 
gtttctcctg 
tgtgtatttt 
ttttctgtta 
ttagtatctt 
aaattgtatt 
aaaagctctg 
tatatttgca 



gttgtacaac 
ctctgtccat 
gcaagtcaga 
ttttcaattt 
gcagctagtt 
taggaagtgt 
tttagcacat 
aaataaattt 



catgctttta 
tctttcactg 
agtctcccag 
cctaagcttt 
atattagttt 
tgcttggcaa 
aactaagttg 
tttttcatat 
aatatgggtt 
ctgtgtcaat 
caagaccagc 
caggcatggt 
gatcgcttca 
gcctaggtaa 
aaatttgctt 
gagagagaaa 
taattttaac 
cttggcagat 
ggcttgcttt 
actcacactt 
tatagagact 
tacaacattc 
attatgaagt 
tcgatactat 
cctgtagcaa 
gtatactata 
acttcaaata 



acacagaaaa 
agcagaactt 
ttgtctgtgc 
tatcaaaaga 
tgagagcctg 
tgatttcaca 
gctctggtcc 
gctaggaaga 
ccttgtgagt 
gttcaccttc 
tatttcccat 
gtgttccagg 
cctttcagtg 
cctttaagtt 
gcccacatta 
ggaaaattaa 
catcatcttt 
catccatgga 
aaatttcctc 
ccatgtcttt 
tttgaacttt 
tagcctgacc 
cacaaaaggt 
ttagaagctt 
ccgtatacct 
gatttacagc 
tgtttttcag 
tctctaatgg 
tacataaaat 
gagcaaagtt 
ggttatataa 
ttctgtaatc 
gtgaataaaa 
gtttcatttc 
acataagctt 
caagtgatct 
ctgggcaaca 
ggcgtacact 
gcccaggagg 
cagcacgaga 
attatagaca 
tgtctaccaa 
cttaatttgt 
attccgtatc 
ataaacaaga 
tttgattaaa 
atgtaacatg 
acatacttgt 
gctcgtctgt 
catcaatatt 
ctggggagtc 
ataatagctg 
gaaagctgaa 



tggtataaaa 
taaatttccc 
cataagtgct 
gtcggtgtga 
cagacattct 
gaatattcat 
ttcccttgga 
aatgcagagc 
gggcttcacc 
tcaagccagt 
taggtttagt 
aaccagatca 
tgaactgttc 
gacatttctg 
aacataccag 
gtgactgtgt 
gaagagtcgt 
cactcaggat 
atgctgaatt 
gttaagtaca 
gagaggcaat 
tgcataagat 
ctataaaata 



ctccttaacc 

acctgtaagt 

ttagttatta 

aaaagggtgt 

ctgtgctgtt 

aggcttaatg 

ttaaaaaaat 

attcatctgc 

attaagtgat 

gtgccgatct 

catgtgattt 

tattttttcc 

aactagactg 

tatcgagaac 

gagtagtttg 

ttgagattgc 

ccccaactct 

cacagtaact 

agcagtattt 

ttttagtagt 

tggtggaaag 

ttttttctcc 

gaacttgaaa 

caatcattag 

caaatattca 

acaatcgcta 

tgacatcttt 

atatatgagg 

gttatcctga 

gtacttctaa 



1096 

1156 

1216 

1276 

1336 

1396 

1456 

1516 

1576 

1636 

1696 

1756 

1816 

1876 

1936 

1996 

2056 

2116 

2176 

2236 

2296 

2356 

2416 

2476 

2536 

2596 

2656 

2716 

2776 

2836 

2896 

2956 

3016 

3076 

3136 

3196 

3256 

3316 

3376 

3436 

3496 

3556 

3616 

3676 

3736 

3796 

3856 

3916 

3976 

4036 

4096 

4156 

4216 

4276 

4336 

4396 

4456 

4516 

4576 

4636 

4696 
tatactgagg gaagtataat atgtggaaca aactctcaac aaaatgttta ttgatgttga 4756 

tgaaacagat cagtttttcc atccggatta ttattggttc atgattttat atgtgaatat 4816 

gtaagatatg ttctgcaatt ttataaatgt tcatgtcttt ttttaaaaaa ggtgctattg 4876 

aaattctgtg tctccagcag gcaagaatac ttgactaact ctttttgtct ctttatggta 4936 

ttttcagaat aaagtctgac ttgtgttttt gagattattg gtgcctcatt aattcagcaa 4996 

taaaggaaaa tatgcatctc aaaaat 5022 

<210> 116 

<211> 4932 

<212> DNA 

<213> Homo sapiens 

<220> 

<221> misc_feature 
<222> 31..33 
<223> ATG 

<221> misc_feature 
<222> 814..816 
<223> TAG 

<221> polyA_signal 
<222> 4905..4910 
<223> AATAAA 
<400> 116 

ctgctgtccc tggtgctcca cacgtactcc atg cgc tac ctg ctg ccc agc gtc 54 
Met Arg Tyr Leu Leu Pro Ser Val 
1 5 

gtg ctc ctg ggc acg gcg ccc acc tac gtg ttg gcc tgg ggg gtc tgg 102 
Val Leu Leu Gly Thr Ala Pro Thr Tyr Val Leu Ala Trp Gly Val Trp 
10 15 20 

cgg ctg ctc tcc gcc ttc ctg ccc gcc cgc ttc tac caa gcg ctg gac 150 
Arg Leu Leu Ser Ala Phe Leu Pro Ala Arg Phe Tyr Gln Ala Leu Asp 
25 30 35 40 

gac cgg ctg tac tgc gtc tac cag agc atg gtg ctc ttc ttc ttc gag 198 
Asp Arg Leu Tyr Cys Val Tyr Gln Ser Met Val Leu Phe Phe Phe Glu 
45 50 55 

aat tac acc ggg gtc cag atg tat ctt gtg att ttt cca gaa ggt aca 246 
Asn Tyr Thr Gly Val Gln Met Tyr Leu Val Ile Phe Pro Glu Gly Thr 
60 65 70 

agg tat aat cca gag caa aca aaa gtc ctt tca gct agt cag gca ttt 294 
Arg Tyr Asn Pro Glu Gln Thr Lys Val Leu Ser Ala Ser Gln Ala Phe 
75 80 85 

gct gcc caa cgt ggc ctt gca gta tta aaa cat gtg cta aca cca cga 342 
Ala Ala Gln Arg Gly Leu Ala Val Leu Lys His Val Leu Thr Pro Arg 
90 95 100 

ata aag gca act cac gtt gct ttt gat tgc atg aag aat tat tta gat 390 
Ile Lys Ala Thr His Val Ala Phe Asp Cys Met Lys Asn Tyr Leu Asp 
105 110 115 120 

gca att tat gat gtt acg gtg gtt tat gaa ggg aaa gac gat gga ggg 438 
Ala Ile Tyr Asp Val Thr Val Val Tyr Glu Gly Lys Asp Asp Gly Gly 
125 130 135 

cag cga aga gag tca ccg acc atg acg gaa ttt ctc tgc aaa gaa tgt 486 
Gln Arg Arg Glu Ser Pro Thr Met Thr Glu Phe Leu Cys Lys Glu Cys 
140 145 150 

cca aaa att cat att cac att gat cgt atc gac aaa aaa gat gtc cca 534 
Pro Lys Ile His Ile His Ile Asp Arg Ile Asp Lys Lys Asp Val Pro 
155 160 165 

gaa gaa caa gaa cat atg aga aga tgg ctg cat gaa cgt ttc gaa atc 582 
Glu Glu Gln Glu His Met Arg Arg Trp Leu His Glu Arg Phe Glu Ile 
170 175 180 

aaa gat aag atg ctt ata gaa ttt tat gag tca cca gat cca gaa aga 630 
Lys Asp Lys Met Leu Ile Glu Phe Tyr Glu Ser Pro Asp Pro Glu Arg 
185 190 195 200 

aga aaa aga ttt cct ggg aaa agt gtt aat tcc aaa tta agt atc aag 678 
Arg Lys Arg Phe Pro Gly Lys Ser Val Asn Ser Lys Leu Ser Ile Lys 
205 210 215 

aag act tta cca tca atg ttg atc tta agt ggt ttg act gca ggc atg 726 
Lys Thr Leu Pro Ser Met Leu Ile Leu Ser Gly Leu Thr Ala Gly Met 
220 225 230 

ctt atg acc gat gct gga agg aag ctg tat gtg aac acc tgg ata tat 774 
Leu Met Thr Asp Ala Gly Arg Lys Leu Tyr Val Asn Thr Trp Ile Tyr 
235 240 245 

gga acc cta ctt ggc tgc ctg tgg gtt act att aaa gca tag 816 
Gly Thr Leu Leu Gly Cys Leu Trp Val Thr Ile Lys Ala * 
250 255 260 

acaagtagct gtctccagac agtgggatgt gctacattgt ctatttttgg cggctgcaca 876 
tgacatcaaa ttgtttcctg aatttattaa ggagtgtaaa taaagccttg ttgattgaag 936 
attggataat agaatttgtg acgaaagctg atatgcaatg gtcttgggca aacatacctg 996 
gttgtacaac tttagcatcg gggctgctgg aagggtaaaa gctaaatgga gtttctcctg 1056 
ctctgtccat ttcctatgaa ctaatgacaa cttgagaagg ctgggaggat tgtgtatttt 1116 
gcaagtcaga tggctgcatt tttgagcatt aatttgcagc gtatttcact ttttctgtta 1176 
ttttcaattt attacaactt gacagctcca agctcttatt actaaagtat ttagtatctt 1236 
gcagctagtt aatatttcat cttttgctta tttctacaag tcagtgaaat aaattgtatt 1296 
taggaagtgt caggatgttc aaaggaaagg gtaaaaagtg ttcatgggga aaaagctctg 1356 
tttagcacat gattttattg tattgcgtta ttagctgatt ttactcattt tatatttgca 1416 
aaataaattt ctaatattta ttgaaattgc ttaatttgca caccctgtac acacagaaaa 1476 
tggtataaaa tatgagaacg aagtttaaaa ttgtgactct gattcattat agcagaactt 1536 
taaatttccc agctttttga agatttaagc tacgctatta gtacttccct ttgtctgtgc 1596 
cataagtgct tgaaaacgtt aaggttttct gttttgtttt gtttttttaa tatcaaaaga 1656 
gtcggtgtga accttggttg gaccccaagt tcacaagatt tttaaggtga tgagagcctg 1716 
cagacattct gcctagattt actagcgtgt gccttttgcc tgcttctctt tgatttcaca 1776 
gaatattcat tcagaagtcg cgtttctgta gtgtggtgga ttcccactgg gctctggtcc 1836 
ttcccttgga tcccgtcagt ggtgctgctc agcggcttgc acgtagactt gctaggaaga 1896 
aatgcagagc cagcctgtgc tgcccacttt cagagttgaa ctctttaagc ccttgtgagt 1956 
gggcttcacc agctactgca gaggcatttt gcatttgtct gtgtcaagaa gttcaccttc 2016 
tcaagccagt gaaatacaga cttaattcgt catgactgaa cgaatttgtt tatttcccat 2076 
taggtttagt ggagctacac attaatatgt atcgccttag agcaagagct gtgttccagg 2136 
aaccagatca cgatttttag ccatggaaca atatatccca tgggagaaga cctttcagtg 2196 
tgaactgttc tatttttgtg ttataattta aacttcgatt tcctcatagt cctttaagtt 2256 
gacatttctg cttactgcta ctggattttt gctgcagaaa tatatcagtg gcccacatta 2316 
aacataccag ttggatcatg ataagcaaaa tgaaagaaat aatgattaag ggaaaattaa 2376 
gtgactgtgt tacactgctt ctcccatgcc agagaataaa ctctttcaag catcatcttt 2436 
gaagagtcgt gtggtgtgaa ttggtttgtg tacattagaa tgtatgcaca catccatgga 2496 
cactcaggat atagttggcc taataatcgg ggcatgggta aaacttatga aaatttcctc 2556 
atgctgaatt gtaattttct cttacctgta aagtaaaatt tagatcaatt ccatgtcttt 2616 
gttaagtaca gggatttaat atattttgaa tataatgggt atgttctaaa tttgaacttt 2676 
gagaggcaat actgttggaa ttatgtggat tctaactcat tttaacaagg tagcctgacc 2736 
tgcataagat catttgaatg ttaggtttca tagaactata ctaatcttct cacaaaaggt 2796 
ctataaaata cagtcgttga aaaaaatttt gtatcaaaat gtttggaaaa ttagaagctt 2856 
ctccttaacc tgtattgata ctgacttgaa ttattttcta aaattaagag ccgtatacct 2916 
acctgtaagt cttttcacat atcatttaaa cttttgtttg tattattact gatttacagc 2976 
ttagttatta atttttcttt ataagaatgc cgtcgatgtg catgctttta tgtttttcag 3036 
aaaagggtgt gtttggatga aagtaaaaaa aaaaataaaa tctttcactg tctctaatgg 3096 
ctgtgctgtt taacattttt tgaccctaaa attcaccaac agtctcccag tacataaaat 3156 
aggcttaatg actggccctg cattcttcac aatatttttc cctaagcttt gagcaaagtt 3216 
ttaaaaaaat acactaaaat aatcaaaact gttaagcagt atattagttt ggttatataa 3276 
attcatctgc aatttataag atgcatggcc gatgttaatt tgcttggcaa ttctgtaatc 3336 
attaagtgat ctcagtgaaa catgtcaaat gccttaaatt aactaagttg gtgaataaaa 3396 
gtgccgatct ggctaactct tacaccatac atactgatag tttttcatat gtttcatttc 3456 
catgtgattt ttaaaattta gagtggcaac aattttgctt aatatgggtt acataagctt 3516 
tattttttcc tttgttcata attatattct ttgaataggt ctgtgtcaat caagtgatct 3576 
aactagactg atcatagata gaaggaaata aggccaagtt caagaccagc ctgggcaaca 3636 
tatcgagaac ctgtctacaa aaaaattaaa aaaaattagc caggcatggt ggcgtacact 3696 
gagtagtttg tcccagctac tcgggagggt gaggtgggag gatcgcttca gcccaggagg 3756 
ttgagattgc agtgagccat ggacatacca ctgcactaca gcctaggtaa cagcacgaga 3816 
ccccaactct tagaaaatga aaaggaaata tagaaatata aaatttgctt attatagaca 3876 
cacagtaact cccagatatg taccacaaaa aatgtgaaaa gagagagaaa tgtctaccaa 3936 
agcagtattt tgtgtgtata attgcaagcg catagtaaaa taattttaac cttaatttgt 3996 
ttttagtagt gtttagattg aagattgagt gaaatatttt cttggcagat attccgtatc 4056 
tggtggaaag ctacaatgca atgtcgttgt agttttgcat ggcttgcttt ataaacaaga 4116 
ttttttctcc ctccttttgg gccagttttc attacgagta actcacactt tttgattaaa 4176 
gaacttgaaa ttacgttatc acttagtata attgacatta tatagagact atgtaacatg 4236 
caatcattag aatcaaaatt agtactttgg tcaaaatatt tacaacattc acatacttgt 4296 
caaatattca tgtaattaac tgaatttaaa accttcaact attatgaagt gctcgtctgt 4356 
acaatcgcta atttactcag tttagagtag ctacaactct tcgatactat catcaatatt 4416 
tgacatcttt tccaatttgt gtatgaaaag taaatctatt cctgtagcaa ctggggagtc 4476 
atatatgagg tcaaagacat ataccttgtt attataatat gtatactata ataatagctg 4536 
gttatcctga gcaggggaaa aggttatttt taggaaaacc acttcaaata gaaagctgaa 4596 
gtacttctaa tatactgagg gaagtataat atgtggaaca aactctcaac aaaatgttta 4656 
ttgatgttga tgaaacagat cagtttttcc atccggatta ttattggttc atgattttat 4716 
atgtgaatat gtaagatatg ttctgcaatt ttataaatgt tcatgtcttt ttttaaaaaa 4776 
ggtgctattg aaattctgtg tctccagcag gcaagaatac ttgactaact ctttttgtct 4836 
ctttatggta ttttcagaat aaagtctgac ttgtgttttt gagattattg gtgcctcatt 4896 
aattcagcaa taaaggaaaa tatgcatctc aaaaat 4932 
<210> 117 
<211> 4682 
<212> DNA 
<213> Homo sapiens 
<220> 

<221> misc_feature 
<222> 31..33 
<223> ATG 

<221> misc_feature 
<222> 301..303 
<223> TGA 

<221> polyA_signal 
<222> 4655..4660 
<223> AATAAA 
<400> 117 

ctgctgtccc tggtgctcca cacgtactcc atg cgc tac ctg ctg ccc agc gtc 54 
Met Arg Tyr Leu Leu Pro Ser Val 
1 5 

gtg ctc ctg ggc acg gcg ccc acc tac gtg ttg gcc tgg ggg gtc tgg 102 
Val Leu Leu Gly Thr Ala Pro Thr Tyr Val Leu Ala Trp Gly Val Trp 
10 15 20 

cgg ctg ctc tcc gcc ttc ctg ccc gcc cgc ttc tac caa gcg ctg gac 150 
Arg Leu Leu Ser Ala Phe Leu Pro Ala Arg Phe Tyr Gln Ala Leu Asp 
25 30 35 40 

gac cgg ctg tac tgc gtc tac cag agc atg gtg ctc ttc ttc ttc gag 198 
Asp Arg Leu Tyr Cys Val Tyr Gln Ser Met Val Leu Phe Phe Phe Glu 
45 50 55 

aat tac acc ggg gtc cag aat ttc tct gca aag aat gtc caa aaa ttc 246 
Asn Tyr Thr Gly Val Gln Asn Phe Ser Ala Lys Asn Val Gln Lys Phe 
60 65 70 

ata ttc aca ttg atc gta tcg aca aaa aag atg tcc cag aag aac aag 294 
Ile Phe Thr Leu Ile Val Ser Thr Lys Lys Met Ser Gln Lys Asn Lys 
75 80 85 

aac ata tga gaagatggct gcatgaacgt ttcgaaatca aagataagat 343 
Asn Ile * 
90 

gcttatagaa ttttatgagt caccagatcc agaaagaaga aaaagatttc ctgggaaaag 403 
tgttaattcc aaattaagta tcaagaagac tttaccatca atgttgatct taagtggttt 463 
gactgcaggc atgcttatga ccgatgctgg aaggaagctg tatgtgaaca cctggatata 523 
tggaacccta cttggctgcc tgtgggttac tattaaagca tagacaagta gctgtctcca 583 
gacagtggga tgtgctacat tgtctatttt tggcggctgc acatgacatc aaattgtttc 643 
ctgaatttat taaggagtgt aaataaagcc ttgttgattg aagattggat aatagaattt 703 
gtgacgaaag ctgatatgca atggtcttgg gcaaacatac ctggttgtac aactttagca 763 
tcggggctgc tggaagggta aaagctaaat ggagtttctc ctgctctgtc catttcctat 823 
gaactaatga caacttgaga aggctgggag gattgtgtat tttgcaagtc agatggctgc 883 
atttttgagc attaatttgc agcgtatttc actttttctg ttattttcaa tttattacaa 943 
cttgacagct ccaagctctt attactaaag tatttagtat cttgcagcta gttaatattt 1003 
catcttttgc ttatttctac aagtcagtga aataaattgt atttaggaag tgtcaggatg 1063 
ttcaaaggaa agggtaaaaa gtgttcatgg ggaaaaagct ctgtttagca catgatttta 1123 
ttgtattgcg ttattagctg attttactca ttttatattt gcaaaataaa tttctaatat 1183 
ttattgaaat tgcttaattt gcacaccctg tacacacaga aaatggtata aaatatgaga 1243 
acgaagttta aaattgtgac tctgattcat tatagcagaa ctttaaattt cccagctttt 1303 
tgaagattta agctacgcta ttagtacttc cctttgtctg tgccataagt gcttgaaaac 1363 
gttaaggttt tctgttttgt tttgtttttt taatatcaaa agagtcggtg tgaaccttgg 1423 
ttggacccca agttcacaag atttttaagg tgatgagagc ctgcagacat tctgcctaga 1483 
tttactagcg tgtgcctttt gcctgcttct ctttgatttc acagaatatt cattcagaag 1543 
tcgcgtttct gtagtgtggt ggattcccac tgggctctgg tccttccctt ggatcccgtc 1603 
agtggtgctg ctcagcggct tgcacgtaga cttgctagga agaaatgcag agccagcctg 1663 
tgctgcccac tttcagagtt gaactcttta agcccttgtg agtgggcttc accagctact 1723 
gcagaggcat tttgcatttg tctgtgtcaa gaagttcacc ttctcaagcc agtgaaatac 1783 
agacttaatt cgtcatgact gaacgaattt gtttatttcc cattaggttt agtggagcta 1843 
cacattaata tgtatcgcct tagagcaaga gctgtgttcc aggaaccaga tcacgatttt 1903 
tagccatgga acaatatatc ccatgggaga agacctttca gtgtgaactg ttctattttt 1963 
gtgttataat ttaaacttcg atttcctcat agtcctttaa gttgacattt ctgcttactg 2023 
ctactggatt tttgctgcag aaatatatca gtggcccaca ttaaacatac cagttggatc 2083 
atgataagca aaatgaaaga aataatgatt aagggaaaat taagtgactg tgttacactg 2143 
cttctcccat gccagagaat aaactctttc aagcatcatc tttgaagagt cgtgtggtgt 2203 
gaattggttt gtgtacatta gaatgtatgc acacatccat ggacactcag gatatagttg 2263 
gcctaataat cggggcatgg gtaaaactta tgaaaatttc ctcatgctga attgtaattt 2323 
tctcttacct gtaaagtaaa atttagatca attccatgtc tttgttaagt acagggattt 2383 
aatatatttt gaatataatg ggtatgttct aaatttgaac tttgagaggc aatactgttg 2443 
gaattatgtg gattctaact cattttaaca aggtagcctg acctgcataa gatcacttga 2503 
atgttaggtt tcatagaact atactaatct tctcacaaaa ggtctataaa atacagtcgt 2563 
tgaaaaaaat tttgtatcaa aatgtttgga aaattagaag cttctcctta acctgtattg 2623 
atactgactt gaattatttt ctaaaattaa gagccgtata cctacctgta agtcttttca 2683 
catatcattt aaacttttgt ttgtattatt actgatttac agcttagtta ttaatttttc 2743 
tttataagaa tgccgtcgat gtgcatgctt ttatgttttt cagaaaaggg tgtgtttgga 2803 
tgaaagtaaa aaaaaaaata aaatctttca ctgtctctaa tggctgtgct gtttaacatt 2863 
ttttgaccct aaaattcacc aacagtctcc cagtacataa aataggctta atgactggcc 2923 
ctgcattctt cacaatattt ttccctaagc tttgagcaaa gttttaaaaa aatacactaa 2983 
aataatcaaa actgttaagc agtatattag tttggttata taaattcatc tgcaatttat 3043 
aagatgcatg gccgatgtta atttgcttgg caattctgta atcattaagt gatctcagtg 3103 
aaacatgtca aatgccttaa attaactaag ttggtgaata aaagtgccga tctggctaac 3163 
tcttacacca tacatactga tagtttttca tatgtttcat ttccatgtga tttttaaaat 3223 
ttagagtggc aacaattttg cttaatatgg gttacataag ctttattttt tcctttgttc 3283 
ataattatat tctttgaata ggtctgtgtc aatcaagtga tctaactaga ctgatcatag 3343 
atagaaggaa ataaggccaa gttcaagacc agcctgggca acatatcgag aacctgtcta 3403 
caaaaaaatt aaaaaaaatt agccaggcat ggtggcgtac actgagtagt ttgtcccagc 3463 
tactcgggag ggtgaggtgg gaggatcgct tcagcccagg aggttgagat tgcagtgagc 3523 
catggacata ccactgcact acagcctagg taacagcacg agaccccaac tcttagaaaa 3583 
tgaaaaggaa atatagaaat ataaaatttg cttattatag acacacagta actcccagat 3643 
atgtaccaca aaaaatgtga aaagagagag aaatgtctac caaagcagta ttttgtgtgt 3703 
ataattgcaa gcgcatagta aaataatttt aaccttaatt tgtttttagt agtgtttaga 3763 
ttgaagattg agtgaaatat tttcttggca gatattccgt atctggtgga aagctacaat 3823 
gcaatgtcgt tgtagttttg catggcttgc tttataaaca agattttttc tccctccttt 3883 
tgggccagtt ttcattacga gtaactcaca ctttttgatt aaagaacttg aaattacgtt 3943 
atcacttagt ataattgaca ttatatagag actatgtaac atgcaatcat tagaatcaaa 4003 
attagtactt tggtcaaaat atttacaaca ttcacatact tgtcaaatat tcatgtaatt 4063 
aactgaattt aaaaccttca actattatga agtgctcgtc tgtacaatcg ctaatttact 4123 
cagtttagag tagctacaac tcttcgatac tatcatcaat atttgacatc ttttccaatt 4183 
tgtgtatgaa aagtaaatct attcctgtag caactgggga gtcatatatg aggtcaaaga 4243 
catatacctt gttattataa tatgtatact ataataatag ctggttatcc tgagcagggg 4303 
aaaaggttat ttttaggaaa accacttcaa atagaaagct gaagtacttc taatatactg 4363 
agggaagtat aatatgtgga acaaactctc aacaaaatgt ttattgatgt tgatgaaaca 4423 
gatcagtttt tccatccgga ttattattgg ttcatgattt tatatgtgaa tatgtaagat 4483 
atgttctgca attttataaa tgttcatgtc tttttttaaa aaaggtgcta ttgaaattct 4543 
gtgtctccag caggcaagaa tacttgacta actctttttg tctctttatg gtattttcag 4603 
aataaagtct gacttgtgtt tttgagatta ttggtgcctc attaattcag caataaagga 4663 
aaatatgcat ctcaaaaat 4682 
<210> 118 
<211> 4558 
<212> DNA 
<213> Homo sapiens 
<220> 

<221> misc_feature 
<222> 31..33 
<223> ATG 

<221> misc_feature 
<222> 235..237 
<223> TGA 

<221> polyA_signal 
<222> 4531..4536 
<223> AATAAA 
<400> 118 

ctgctgtccc tggtgctcca cacgtactcc atg cgc tac ctg ctg ccc agc gtc 54 
Met Arg Tyr Leu Leu Pro Ser Val 
1 5 

gtg ctc ctg ggc acg gcg ccc acc tac gtg ttg gcc tgg ggg gtc tgg 102 
Val Leu Leu Gly Thr Ala Pro Thr Tyr Val Leu Ala Trp Gly Val Trp 
10 15 20 

cgg ctg ctc tcc gcc ttc ctg ccc gcc cgc ttc tac caa gcg ctg gac 150 
Arg Leu Leu Ser Ala Phe Leu Pro Ala Arg Phe Tyr Gln Ala Leu Asp 
25 30 35 40 

gac cgg ctg tac tgc gtc tac cag agc atg gtg ctc ttc ttc ttc gag 198 
Asp Arg Leu Tyr Cys Val Tyr Gln Ser Met Val Leu Phe Phe Phe Glu 
45 50 55 

aat tac acc ggg gtc cag gat gct tat aga att tta tga gtcaccagat 247 
Asn Tyr Thr Gly Val Gln Asp Ala Tyr Arg Ile Leu * 
60 65 

ccagaaagaa gaaaaagatt tcctgggaaa agtgttaatt ccaaattaag tatcaagaag 307 
actttaccat caatgttgat cttaagtggt ttgactgcag gcatgcttat gaccgatgct 367 
ggaaggaagc tgtatgtgaa cacctggata tatggaaccc tacttggctg cctgtgggtt 427 
actattaaag catagacaag tagctgtctc cagacagtgg gatgtgctac attgtctatt 487 
tttggcggct gcacatgaca tcaaattgtt tcctgaattt attaaggagt gtaaataaag 547 
ccttgttgat tgaagattgg ataatagaat ttgtgacgaa agctgatatg caatggtctt 607 
gggcaaacat acctggttgt acaactttag catcggggct gctggaaggg taaaagctaa 667 
atggagtttc tcctgctctg tccatttcct atgaactaat gacaacttga gaaggctggg 727 
aggattgtgt attttgcaag tcagatggct gcatttttga gcattaattt gcagcgtatt 787 
tcactttttc tgttattttc aatttattac aacttgacag ctccaagctc ttattactaa 847 
agtatttagt atcttgcagc tagttaatat ttcatctttt gcttatttct acaagtcagt 907 
gaaataaatt gtatttagga agtgtcagga tgttcaaagg aaagggtaaa aagtgttcat 967 
ggggaaaaag ctctgtttag cacatgattt tattgtattg cgttattagc tgattttact 1027 
cattttatat ttgcaaaata aatttctaat atttattgaa attgcttaat ttgcacaccc 1087 
tgtacacaca gaaaatggta taaaatatga gaacgaagtt taaaattgtg actctgattc 1147 
attatagcag aactttaaat ttcccagctt tttgaagatt taagctacgc tattagtact 1207 
tccctttgtc tgtgccataa gtgcttgaaa acgttaaggt tttctgtttt gttttgtttt 1267 
tttaatatca aaagagtcgg tgtgaacctt ggttggaccc caagttcaca agatttttaa 1327 
ggtgatgaga gcctgcagac attctgccta gatttactag cgtgtgcctt ttgcctgctt 1387 
ctctttgatt tcacagaata ttcattcaga agtcgcgttt ctgtagtgtg gtggattccc 1447 
actgggctct ggtccttccc ttggatcccg tcagtggtgc tgctcagcgg cttgcacgta 1507 
gacttgctag gaagaaatgc agagccagcc tgtgctgccc actttcagag ttgaactctt 1567 
taagcccttg tgagtgggct tcaccagcta ctgcagaggc attttgcatt tgtctgtgtc 1627 
aagaagttca ccttctcaag ccagtgaaat acagacttaa ttcgtcatga ctgaacgaat 1687 
ttgtttattt cccattaggt ttagtggagc tacacattaa tatgtatcgc cttagagcaa 1747 
gagctgtgtt ccaggaacca gatcacgatt tttagccatg gaacaatata tcccatggga 1807 

gaagaccttt cagtgtgaac tgttctattt ttgtgttata atttaaactt cgatttcctc 1867 

atagtccttt aagttgacat ttctgcttac tgctactgga tttttgctgc agaaatatat 1927 

cagtggccca cattaaacat accagttgga tcatgataag caaaatgaaa gaaataatga 1987 

ttaagggaaa attaagtgac tgtgttacac tgcttctccc atgccagaga ataaactctt 2047 

tcaagcatca tctttgaaga gtcgtgtggt gtgaattggt ttgtgtacat tagaatgtat 2107 

gcacacatcc atggacactc aggatatagt tggcctaata atcggggcat gggtaaaact 2167 

tatgaaaatt tcctcatgct gaattgtaat tttctcttac ctgtaaagta aaatttagat 2227 

caattccatg tctttgttaa gtacagggat ttaatatatt ttgaatataa tgggtatgtt 2287 

ctaaatttga actttgagag gcaatactgt tggaattatg tggattctaa ctcattttaa 2347 

caaggtagcc tgacctgcat aagatcactt gaatgttagg tttcatagaa ctatactaat 2407 

cttctcacaa aaggtctata aaatacagtc gttgaaaaaa attttgtatc aaaatgtttg 2467 

gaaaattaga agcttctcct taacctgtat tgatactgac ttgaattatt ttctaaaatt 2527 

aagagccgta tacctacctg taagtctttt cacatatcat ttaaactttt gtttgtatta 2587 

ttactgattt acagcttagt tattaatttt tctttataag aatgccgtcg atgtgcatgc 2647 

ttttatgttt ttcagaaaag ggtgtgtttg gatgaaagta aaaaaaaaaa taaaatcttt 2707 

cactgtctct aatggctgtg ctgtttaaca ttttttgacc ctaaaattca ccaacagtct 2767 

cccagtacat aaaataggct taatgactgg ccctgcattc ttcacaatat ttttccctaa 2827 

gctttgagca aagttttaaa aaaatacact aaaataatca aaactgttaa gcagtatatt 2887 

agtttggtta tataaattca tctgcaattt ataagatgca tggccgatgt taatttgctt 2947 

ggcaattctg taatcattaa gtgatctcag tgaaacatgt caaatgcctt aaattaacta 3007 

agttggtgaa taaaagtgcc gatctggcta actcttacac catacatact gatagttttt 3067 

catatgtttc atttccatgt gatttttaaa atttagagtg gcaacaattt tgcttaatat 3127 

gggttacata agctttattt tttcctttgt tcataattat attctttgaa taggtctgtg 3187 

tcaatcaagt gatctaacta gactgatcat agatagaagg aaataaggcc aagttcaaga 3247 

ccagcctggg caacatatcg agaacctgtc tacaaaaaaa ttaaaaaaaa ttagccaggc 3307 

atggtggcgt acactgagta gtttgtccca gctactcggg agggtgaggt gggaggatcg 3367 

cttcagccca ggaggttgag attgcagtga gccatggaca taccactgca ctacagccta 3427 

ggtaacagca cgagacccca actcttagaa aatgaaaagg aaatatagaa atataaaatt 3487 

tgcttattat agacacacag taactcccag atatgtacca caaaaaatgt gaaaagagag 3547 

agaaatgtct accaaagcag tattttgtgt gtataattgc aagcgcatag taaaataatt 3607 

ttaaccttaa tttgttttta gtagtgttta gattgaagat tgagtgaaat attttcttgg 3667 

cagatattcc gtatctggtg gaaagctaca atgcaatgtc gttgtagttt tgcatggctt 3727 

gctttataaa caagattttt tctccctcct tttgggccag ttttcattac gagtaactca 3787 

cactttttga ttaaagaact tgaaattacg ttatcactta gtataattga cattatatag 3847 

agactatgta acatgcaatc attagaatca aaattagtac tttggtcaaa atatttacaa 3907 

cattcacata cttgtcaaat attcatgtaa ttaactgaat ttaaaacctt caactattat 3967 

gaagtgctcg tctgtacaat cgctaattta ctcagtttag agtagctaca actcttcgat 4027 

actatcatca atatttgaca tcttttccaa tttgtgtatg aaaagtaaat ctattcctgt 4087 

agcaactggg gagtcatata tgaggtcaaa gacatatacc ttgttattat aatatgtata 4147 

ctataataat agctggttat cctgagcagg ggaaaaggtt atttttagga aaaccacttc 4207 

aaatagaaag ctgaagtact tctaatatac tgagggaagt ataatatgtg gaacaaactc 4267 

tcaacaaaat gtttattgat gttgatgaaa cagatcagtt tttccatccg gattattatt 4327 

ggttcatgat tttatatgtg aatatgtaag atatgttctg caattttata aatgttcatg 4387 

tcttttttta aaaaaggtgc tattgaaatt ctgtgtctcc agcaggcaag aatacttgac 4447 

taactctttt tgtctcttta tggtattttc agaataaagt ctgacttgtg tttttgagat 4507 

tattggtgcc tcattaattc agcaataaag gaaaatatgc atctcaaaaa t 4558 
<210> 119 
<211> 5270 
<212> DNA 
<213> Homo sapiens 
<220> 

<221> misc_feature 
<222> 31..33 
<223> ATG 

<221> misc_feature 
<222> 229..231 
<223> TAG 

<221> polyA_signal 
<222> 5243..5248 
<223> AATAAA 
<400> 119 

ctgctgtccc tggtgctcca cacgtactcc atg cgc tac ctg ctg ccc agc gtc 54 
Met Arg Tyr Leu Leu Pro Ser Val 
1 5 

gtg ctc ctg ggc acg gcg ccc acc tac gtg ttg gcc tgg ggg gtc tgg 102 
Val Leu Leu Gly Thr Ala Pro Thr Tyr Val Leu Ala Trp Gly Val Trp 
10 15 20 

cgg ctg ctc tcc gcc ttc ctg ccc gcc cgc ttc tac caa gcg ctg gac 150 
Arg Leu Leu Ser Ala Phe Leu Pro Ala Arg Phe Tyr Gln Ala Leu Asp 
25 30 35 40 

gac cgg ctg tac tgc gtc tac cag agc atg gtg ctc ttc ttc ttc gag 198 
Asp Arg Leu Tyr Cys Val Tyr Gln Ser Met Val Leu Phe Phe Phe Glu 
45 50 55 

aat tac acc ggg gtc cag aga ttg gat tcg tag attaaacttg agaaacaaac 251 
Asn Tyr Thr Gly Val Gln Arg Leu Asp Ser * 
60 65 

cataaaagtg gaaggccctc tttaacaata ttgctatatg gagatttgcc aaaaaataaa 311 
gaaaatataa tatatttagc aaatcatcaa agcacagttg actggattgt tgctgacatc 371 
ttggccatca ggcagaatgc gctaggacat gtgcgctacg tgctgaaaga agggttaaaa 431 
tggctgccat tgtatgggtg ttactttgct cagcatggag gaatctatgt aaagcgcagt 491 
gccaaattta acgagaaaga gatgcgaaac aagttgcaga gctacgtgga cgcaggaact 551 
ccaatgtatc ttgtgatttt tccagaaggt acaaggtata atccagagca aacaaaagtc 611 
ctttcagcta gtcaggcatt tgctgcccaa cgtggccttg cagtattaaa acatgtgcta 671 
acaccacgaa taaaggcaac tcacgttgct tttgattgca tgaagaatta tttagatgca 731 
atttatgatg ttacggtggt ttatgaaggg aaagacgatg gagggcagcg aagagagtca 791 
ccgaccatga cggaatttct ctgcaaagaa tgtccaaaaa ttcatattca cattgatcgt 851 
atcgacaaaa aagatgtccc agaagaacaa gaacatatga gaagatggct gcatgaacgt 911 
ttcgaaatca aagataagat gcttatagaa ttttatgagt caccagatcc agaaagaaga 971 
aaaagatttc ctgggaaaag tgttaattcc aaattaagta tcaagaagac tttaccatca 1031 
atgttgatct taagtggttt gactgcaggc atgcttatga ccgatgctgg aaggaagctg 1091 
tatgtgaaca cctggatata tggaacccta cttggctgcc tgtgggttac tattaaagca 1151 
tagacaagta gctgtctcca gacagtggga tgtgctacat tgtctatttt tggcggctgc 1211 
acatgacatc aaattgtttc ctgaatttat taaggagtgt aaataaagcc ttgttgattg 1271 
aagattggat aatagaattt gtgacgaaag ctgatatgca atggtcttgg gcaaacatac 1331 
ctggttgtac aattttagca tcggggctgc tggaagggta aaagctaaat ggagtttctc 1391 
ctgctctgtc catttcctat gaactaatga caacttgaga aggctgggag gattgtgtat 1451 
tttgcaagtc agatggctgc atttttgagc attaatttgc agcgtatttc actttttctg 1511 
ttattttcaa tttattacaa cttgacagct ccaagctctt attactaaag tatttagtat 1571 
cttgcagcta gttaatattt catcttttgc ttatttctac aagtcagtga aataaattgt 1631 
atttaggaag tgtcaggatg ttcaaaggaa agggtaaaaa gtgttcatgg ggaaaaagct 1691 
ctgtttagca catgatttta ttgtattgcg ttattagctg attttactca ttttatattt 1751 
gcaaaataaa tttctaatat ttattgaaat tgcttaattt gcacaccctg tacacacaga 1811 
aaatggtata aaatatgaga acgaagttta aaattgtgac tctgattcat tatagcagaa 1871 
ctttaaattt cccagctttt tgaagattta agctacgcta ttagtacttc cctttgtctg 1931 
tgccataagt gcttgaaaac gttaaggttt tctgttttgt tttgtttttt taatatcaaa 1991 
agagtcggtg tgaaccttgg ttggacccca agttcacaag atttttaagg tgatgagagc 2051 
ctgcagacat tctgcctaga tttactagcg tgtgcctttt gcctgcttct ctttgatttc 2111 
acagaatatt cattcagaag tcgcgtttct gtagtgtggt ggattcccac tgggctctgg 2171 
tccttccctt ggatcccgtc agtggtgctg ctcagcggct tgcacgtaga cttgctagga 2231 
agaaatgcag agccagcctg tgctgcccac tttcagagtt gaactcttta agcccttgtg 2291 
agtgggcttc accagctact gcagaggcat tttgcatttg tctgtgtcaa gaagttcacc 2351 
ttctcaagcc agtgaaatac agacttaatt cgtcatgact gaacgaattt gtttatttcc 2411 
cattaggttt agtggagcta cacattaata tgtatcgcct tagagcaaga gctgtgttcc 2471 
aggaaccaga tcacgatttt tagccatgga acaatatatc ccatgggaga agacctttca 2531 
gtgtgaactg ttctattttt gtgttataat ttaaacttcg atttcctcat agtcctttaa 2591 
gttgacattt ctgcttactg ctactggatt tttgctgcag aaatatatca gtggcccaca 2651 
ttaaacatac cagttggatc atgataagca aaatgaaaga aataatgatt aagggaaaat 2711 
taagtgactg tgttacactg cttctcccat gccagagaat aaactctttc aagcatcatc 2771 
tttgaagagt cgcgtggtgt gaattggttt gtgtacatta gaatgtatgc acacatccat 2831 
ggacactcag gatatagttg gcctaataat cggggcatgg gtaaaactta tgaaaatttc 2891 
ctcatgctga attgtaattt tctcttacct gtaaagtaaa atttagatca attccatgtc 2951 
tttgttaagt acagggattt aatatatttt gaatataatg ggtatgttct aaatttgaac 3011 
tttgagaggc aatactgttg gaattatgtg gattctaact cattttaaca aggtagcctg 3071 
acctgcataa gatcacttga atgttaggtt tcatagaact atactaatct tctcacaaaa 3131 
ggtctataaa atacagtcgt tgaaaaaaat tttgtatcaa aatgtttgga aaattagaag 3191 
cttctcctta acctgtattg atactgactt gaattatttt ctaaaattaa gagccgtata 3251 
cctacctgta agtcttttca catatcattt aaacttttgt ttgtattatt actgatttac 3311 
agcttagtta ttaatttttc tttataagaa tgccgtcgat gtgcatgctt ttatgttttt 3371 
cagaaaaggg tgtgtttgga tgaaagtaaa aaaaaaaata aaatctttca ctgtctctaa 3431 
tggctgtgct gtttaacatt ttttgaccct aaaattcacc aacagtctcc cagtacataa 3491 
aataggctta atgactggcc ctgcattctt cacaatattt ttccctaagc tttgagcaaa 3551 
gttttaaaaa aatacactaa aataatcaaa actgttaagc agtatattag tttggttata 3611 
taaattcatc tgcaatttat aagatgcatg gccgatgtta atttgcttgg caattctgta 3671 
atcattaagt gatctcagtg aaacatgtca aatgccttaa attaactaag ttggtgaata 3731 
aaagtgccga tctggctaac tcttacacca tacatactga tagtttttca tatgtttcat 3791 
ttccatgtga tttttaaaat ttagagtggc aacaattttg cttaatatgg gttacataag 3851 
ctttattttt tcctttgttc ataattatat tctttgaata ggtctgtgtc aatcaagtga 3911 
tctaactaga ctgatcatag atagaaggaa ataaggccaa gttcaagacc agcctgggca 3971 
acatatcgag aacctgtcta caaaaaaatt aaaaaaaatt agccaggcat ggtggcgtac 4031 
actgagtagt ttgtcccagc tactcgggag ggtgaggtgg gaggatcgct tcagcccagg 4091 
aggttgagat tgcagtgagc catggacata ccactgcact acagcctagg taacagcacg 4151 
agaccccaac tcttagaaaa tgaaaaggaa atatagaaat ataaaatttg cttattatag 4211 
acacacagta actcccagat atgtaccaca aaaaatgtga aaagagagag aaatgtctac 4271 
caaagcagta ttttgtgtgt ataattgcaa gcgcatagta aaataatttt aaccttaatt 4331 
tgtttttagt agtgtttaga ttgaagattg agtgaaatat tttcttggca gatattccgt 4391 
atctggtgga aagctacaat gcaatgtcgt tgtagttttg catggcttgc tttataaaca 4451 
agattttttc tccctccttt tgggccagtt ttcattacga gtaactcaca ctttttgatt 4511 
aaagaacttg aaattacgtt atcacttagt ataattgaca ttatatagag actatgtaac 4571 
atgcaatcat tagaatcaaa attagtactt tggtcaaaat atttacaaca ttcacatact 4631 
tgtcaaatat tcatgtaatt aactgaattt aaaaccttca actattatga agtgctcgtc 4691 
tgtacaatcg ctaatttact cagtttagag tagctacaac tcttcgatac tatcatcaat 4751 
atttgacatc ttttccaatt tgtgtatgaa aagtaaatct attcctgtag caactgggga 4811 
gtcatatatg aggtcaaaga catatacctt gttattataa tatgtatact ataataatag 4871 
ctggttatcc tgagcagggg aaaaggttat ttttaggaaa accacttcaa atagaaagct 4931 
gaagtacttc taatatactg agggaagtat aatatgtgga acaaactctc aacaaaatgt 4991 
ttattgatgt tgatgaaaca gatcagtttt tccatccgga ttattattgg ttcatgattt 5051 
tatatgtgaa tatgtaagat atgttctgca attttataaa tgttcatgtc tttttttaaa 5111 
aaaggtgcta ttgaaattct gtgtctccag caggcaagaa tacttgacta actctttttg 5171 
tctctttatg gtattttcag aataaagtct gacttgtgtt tttgagatta ttggtgcctc 5231 
attaattcag caataaagga aaatatgcat ctcaaaaat 5270 
<210> 120 
<211> 5002 
<212> DNA 
<213> Homo sapiens 
<220> 

<221> misc_feature 
<222> 31..33 
<223> ATG 

<221> misc_feature 
<222> 322..324 
<223> TAA 

<221> polyA_signal 
<222> 4975..4980 
<223> AATAAA 
<400> 120 

ctgctgtccc tggtgctcca cacgtactcc atg cgc tac ctg ctg ccc agc gtc 54 
Met Arg Tyr Leu Leu Pro Ser Val 
1 5 

gtg ctc ctg ggc acg gcg ccc acc tac gtg ttg gcc tgg ggg gtc tgg 102 
Val Leu Leu Gly Thr Ala Pro Thr Tyr Val Leu Ala Trp Gly Val Trp 
10 15 20 

cgg ctg ctc tcc gcc ttc ctg ccc gcc cgc ttc tac caa gcg ctg gac 150 
Arg Leu Leu Ser Ala Phe Leu Pro Ala Arg Phe Tyr Gln Ala Leu Asp 
25 30 35 40 

gac cgg ctg tac tgc gtc tac cag agc atg gtg ctc ttc ttc ttc gag 198 
Asp Arg Leu Tyr Cys Val Tyr Gln Ser Met Val Leu Phe Phe Phe Glu 
45 50 55 

aat tac acc ggg gtc cag ata ttg cta tat gga gat ttg cca aaa aat 246 
Asn Tyr Thr Gly Val Gln Ile Leu Leu Tyr Gly Asp Leu Pro Lys Asn 
60 65 70 

aaa gaa aat ata ata tat tta gca aat cat caa agc aca gat gta tct 294 
Lys Glu Asn Ile Ile Tyr Leu Ala Asn His Gln Ser Thr Asp Val Ser 
75 80 85 

tgt gat ttt tcc aga agg tac aag gta taa tccagagcaa acaaaagtcc 344 
Cys Asp Phe Ser Arg Arg Tyr Lys Val * 
90 95 



tttcagctag tcaggcattt gctgcccaac gtggccttgc agtattaaaa catgtgctaa 404 

caccacgaat aaaggcaact cacgttgctt ttgattgcat gaagaattat ttagatgcaa 464 

tttatgatgt tacggtggtt tatgaaggga aagacgatgg agggcagcga agagagtcac 524 

cgaccatgac ggaatttctc tgcaaagaat gtccaaaaat tcatattcac attgatcgta 584

tcgacaaaaa agatgtccca gaagaacaag aacatatgag aagatggctg catgaacgtt 644

tcgaaatcaa agataagatg cttatagaat tttatgagtc accagatcca gaaagaagaa 704

aaagatttcc tgggaaaagt gttaattcca aattaagtat caagaagact ttaccatcaa 764

tgttgatctt aagtggtttg actgcaggca tgcttatgac cgatgctgga aggaagctgt 824

atgtgaacac ctggatatat ggaaccctac ttggctgcct gtgggttact attaaagcat 884 

agacaagtag ctgtctccag acagtgggat gtgctacatt gtctattttt ggcggctgca 944

catgacatca aattgtttcc tgaatttatt aaggagtgta aataaagcct tgttgattga 1004

agattggata atagaatttg tgacgaaagc tgatatgcaa tggtcttggg caaacatacc 1064

tggttgtaca actttagcat cggggctgct ggaagggtaa aagctaaatg gagtttctcc 1124 

tgctctgtcc atttcctatg aactaatgac aacttgagaa ggctgggagg attgtgtatt 1184 

ttgcaagtca gatggctgca tttttgagca ttaatttgca gcgtatttca ctttttctgt 1244

tattttcaat ttattacaac ttgacagctc caagctctta ttactaaagt atttagtatc 1304

ttgcagctag ttaatatttc atcttttgct tatttctaca agtcagtgaa ataaattgta 1364

tttaggaagt gtcaggatgt tcaaaggaaa gggtaaaaag tgttcatggg gaaaaagctc 1424

tgtttagcac atgattttat tgtattgcgt tattagctga ttttactcat tttatatttg 1484

caaaataaat ttctaatatt tattgaaatt gcttaatttg cacaccctgt acacacagaa 1544 

aatggtataa aatatgagaa cgaagtttaa aattgtgact ctgattcatt atagcagaac 1604 

tttaaatttc ccagcttttt gaagatttaa gctacgctat tagtacttcc ctttgtctgt 1664

gccataagtg cttgaaaacg ttaaggtttt ctgttttgtt ttgttttttt aatatcaaaa 1724 

gagtcggtgt gaaccttggt tggaccccaa gttcacaaga tttttaaggt gatgagagcc 1784 

tgcagacatt ctgcctagat ttactagcgt gtgccttttg cctgcttctc tttgatttca 1844

cagaatattc attcagaagt cgcgtttctg tagtgtggtg gattcccact gggctctggt 1904

ccttcccttg gatcccgtca gtggtgctgc tcagcggctt gcacgtagac ttgctaggaa 1964

gaaatgcaga gccagcctgt gctgcccact ttcagagttg aactctttaa gcccttgtga 2024

gtgggcttca ccagctactg cagaggcatt ttgcatttgt ctgtgtcaag aagttcacct 2084 

tctcaagcca gtgaaataca gacttaattc gtcatgactg aacgaatttg tttatttccc 2144 

attaggttta gtggagctac acattaatat gtatcgcctt agagcaagag ctgtgttcca 2204 

ggaaccagat cacgattttt agccatggaa caatatatcc catgggagaa gacctttcag 2264

tgtgaactgt tctatttttg tgttataatt taaacttcga tttcctcata gtcctttaag 2324

ttgacatttc tgtttactgc tactggattt ttgctgcaga aatatatcag tggcccacat 2384

taaacatacc agttggatca tgataagcaa aatgaaagaa ataatgatta agggaaaatt 2444

aagtgactgt gttacactgc ttctcccatg ccagagaata aactctttca agcatcatct 2504 

ttgaagagtc gtgtggtgtg aattggtttg tgtacattag aatgtatgca cacatccatg 2564

gacactcagg atatagttgg cctaataatc ggggcatggg taaaacttat gaaaatttcc 2624

tcatgctgaa ttgtaatttt ctcttacctg taaagtaaaa tttagatcaa ttccatgtct 2684

ttgttaagta cagggattta atatattttg aatataatgg gtatgttcta aatttgaact 2744

ttgagaggca atactgttgg aattatgtgg attctaactc attttaacaa ggtagcctga 2804 

cctgcataag atcacttgaa tgttaggttt catagaacta tactaatctt ctcacaaaag 2864 

gtctataaaa tacagtcgtt gaaaaaaatt ttgtatcaaa atgtttggaa aattagaagc 2924 

ttctccttaa cctgtattga tactgacttg aattattttc taaaattaag agccgtatac 2984

ctacctgtaa gtcttttcac atatcattta aacttttgtt tgtattatta ctgatttaca 3044

gcttagttat taatttttct ttataagaat gccgtcgatg tgcatgcttt tatgtttttc 3104

agaaaagggt gtgtttggat gaaagtaaaa aaaaaaataa aatctttcac tgtctctaat 3164
ggctgtgctg tttaacattt tttgacccta aaattcacca acagtctccc agtacataaa 3224
ataggcttaa tgactggccc tgcattcttc acaatatttt tccctaagct ttgagcaaag 3284
ttttaaaaaa atacactaaa ataatcaaaa ctgttaagca gtatattagt ttggttatat 3344
aaattcatct gcaatttata agatgcatgg ccgatgttaa tttgcttggc aattctgtaa 3404
tcattaagtg atctcagtga aacatgtcaa atgccttaaa ttaactaagt tggtgaataa 3464
aagtgccgat ctggctaact cttacaccat acatactgat agtttttcat atgtttcatt 3524
tccatgtgat ttttaaaatt tagagtggca acaattttgc ttaatatggg ttacataagc 3584
tttatttttt cctttgttca taattatatt ctttgaatag gtctgtgtca atcaagtgat 3644
ctaactagac tgatcataga tagaaggaaa taaggccaag ttcaagacca gcctgggcaa 3704
catatcgaga acctgtctac aaaaaaatta aaaaaaatta gccaggcatg gtggcgtaca 3764
ctgagtagtt tgtcccagct actcgggagg gtgaggtggg aggatcgctt cagcccagga 3824
ggttgagatt gcagtgagcc atggacatac cactgcacta cagcctaggt aacagcacga 3884
gaccccaact cttagaaaat gaaaaggaaa tatagaaata taaaatttgc ttattataga 3944
cacacagtaa ctcccagata tgtaccacaa aaaatgtgaa aagagagaga aatgtctacc 4004
aaagcagtat tttgtgtgta taattgcaag cgcatagtaa aataatttta accttaattt 4064
gtttttagta gtgtttagat tgaagattga gtgaaatatt ttcttggcag atattccgta 4124
tctggtggaa agctacaatg caatgtcgtt gtagttttgc atggcttgct ttataaacaa 4184
gattttttct ccctcctttt gggccagttt tcattacgag taactcacac tttttgatta 4244
aagaacttga aattacgtta tcacttagta taattgacat tatatagaga ctatgtaaca 4304
tgcaatcatt agaatcaaaa ttagtacttt ggtcaaaata tttacaacat tcacatactt 4364
gtcaaatatt catgtaatta actgaattta aaaccttcaa ctattatgaa gtgctcgtct 4424
gtacaatcgc taatttactc agtttagagt agctacaact cttcgatact atcatcaata 4484
tttgacatct tttccaattt gtgtatgaaa agtaaatcta ttcctgtagc aactggggag 4544
tcatatatga ggtcaaagac atataccttg ttattataat atgtatacta taataatagc 4604
tggttatcct gagcagggga aaaggttatt tttaggaaaa ccacttcaaa tagaaagctg 4664
aagtacttct aatatactga gggaagtata atatgtggaa caaactctca acaaaatgtt 4724
tattgatgtt gatgaaacag atcagttttt ccatccggat tattattggt tcatgatttt 4784
atatgtgaat atgtaagata tgttctgcaa ttttataaat gttcatgtct ttttttaaaa 4844
aaggtgctat tgaaattctg tgtctccagc aggcaagaat acttgactaa ctctttttgt 4904
ctctttatgg tattttcaga ataaagtctg acttgtgttt ttgagattat tggtgcctca 4964
ttaattcagc aataaaggaa aatatgcatc tcaaaaat 5002
<210> 121 
<211> 4958 
<212> DNA 
<213> Homo sapiens 
<220>
<221> misc_feature
<222> 31..33
<223> ATG
<221> misc_feature
<222> 577..579
<223> TGA
<221> polyA_signal
<222> 4931..4936
<223> AATAAA
<400> 121

ctgctgtccc tggtgctcca cacgtactcc atg cgc tac ctg ctg ccc agc gtc 54
Met Arg Tyr Leu Leu Pro Ser Val
1 5

gtg ctc ctg ggc acg gcg ccc acc tac gtg ttg gcc tgg ggg gtc tgg 102
Val Leu Leu Gly Thr Ala Pro Thr Tyr Val Leu Ala Trp Gly Val Trp
10 15 20

cgg ctg ctc tcc gcc ttc ctg ccc gcc cgc ttc tac caa gcg ctg gac 150
Arg Leu Leu Ser Ala Phe Leu Pro Ala Arg Phe Tyr Gln Ala Leu Asp
25 30 35 40

gac cgg ctg tac tgc gtc tac cag agc atg gtg ctc ttc ttc ttc gag 198
Asp Arg Leu Tyr Cys Val Tyr Gln Ser Met Val Leu Phe Phe Phe Glu
45 50 55

aat tac acc ggg gtc cag ata ttg cta tat gga gat ttg cca aaa aat 246
Asn Tyr Thr Gly Val Gln Ile Leu Leu Tyr Gly Asp Leu Pro Lys Asn
60 65 70

aaa gaa aat ata ata tat tta gca aat cat caa agc aca gtt gac tgg 294
Lys Glu Asn Ile Ile Tyr Leu Ala Asn His Gln Ser Thr Val Asp Trp
75 80 85

att gtt gct gac atc ttg gcc atc agg cag aat gcg cta gga cat gtg 342
Ile Val Ala Asp Ile Leu Ala Ile Arg Gln Asn Ala Leu Gly His Val
90 95 100

cgc tac gtg ctg aaa gaa ggg tta aaa tgg ctg cca ttg tat ggg tgt 390
Arg Tyr Val Leu Lys Glu Gly Leu Lys Trp Leu Pro Leu Tyr Gly Cys
105 110 115 120

tac ttt gct cag cat gga gga atc tat gta aag cgc agt gcc aaa ttt 438
Tyr Phe Ala Gln His Gly Gly Ile Tyr Val Lys Arg Ser Ala Lys Phe
125 130 135

aac gag aaa gag atg cga aac aag ttg cag agc tac gtg gac gca gga 486
Asn Glu Lys Glu Met Arg Asn Lys Leu Gln Ser Tyr Val Asp Ala Gly
140 145 150

act cca aat ttc tct gca aag aat gtc caa aaa ttc ata ttc aca ttg 534
Thr Pro Asn Phe Ser Ala Lys Asn Val Gln Lys Phe Ile Phe Thr Leu
155 160 165

atc gta tcg aca aaa aag atg tcc cag aag aac aag aac ata tga 579
Ile Val Ser Thr Lys Lys Met Ser Gln Lys Asn Lys Asn Ile *
170 175 180

gaagatggct gcatgaacgt ttcgaaatca aagataagat gcttatagaa ttttatgagt 639
caccagatcc agaaagaaga aaaagatttc ctgggaaaag tgttaattcc aaattaagta 699
tcaagaagac tttaccatca atgttgatct taagtggttt gactgcaggc atgcttatga 759
ccgatgctgg aaggaagctg tatgtgaaca cctggatata tggaacccta cttggctgcc 819
tgtgggttac tattaaagca tagacaagta gctgtctcca gacagtggga tgtgctacat 879
tgtctatttt tggcggctgc acatgacatc aaattgtttc ctgaatttat taaggagtgt 939
aaataaagcc ttgttgattg aagattggat aatagaattt gtgacgaaag ctgatatgca 999
atggtcttgg gcaaacatac ctggttgtac aactttagca tcggggctgc tggaagggta 1059
aaagctaaat ggagtttctc ctgctctgtc catttcctat gaactaatga caacttgaga 1119
aggctgggag gattgtgtat tttgcaagtc agatggctgc atttttgagc attaatttgc 1179
agcgtatttc actttttctg ttattttcaa tttattacaa cttgacagct ccaagctctt 1239
attactaaag tatttagtat cttgcagcta gttaatattt catcttttgc ttatttctac 1299
aagtcagtga aataaattgt atttaggaag tgtcaggatg ttcaaaggaa agggtaaaaa 1359
gtgttcatgg ggaaaaagct ctgtttagca catgatttta ttgtattgcg ttattagctg 1419
attttactca ttttatattt gcaaaataaa tttctaatat ttattgaaat tgcttaattt 1479
gcacaccctg tacacacaga aaatggtata aaatatgaga acgaagttta aaattgtgac 1539
tctgattcat tatagcagaa ctttaaattt cccagctttt tgaagattta agctacgcta 1599
ttagtacttc cctttgtctg tgccataagt gcttgaaaac gttaaggttt tctgttttgt 1659
tttgtttttt taatatcaaa agagtcggtg tgaaccttgg ttggacccca agttcacaag 1719
atttttaagg tgatgagagc ctgcagacat tctgcctaga tttactagcg tgtgcctttt 1779
gcctgcttct ctttgatttc acagaatatt cattcagaag tcgcgtttct gtagtgtggt 1839
ggattcccac tgggctctgg tccttccctt ggatcccgtc agtggtgctg ctcagcggct 1899
tgcacgtaga cttgctagga agaaatgcag agccagcctg tgctgcccac tttcagagtt 1959
gaactcttta agcccttgtg agtgggcttc accagctact gcagaggcat tttgcatttg 2019
tctgtgtcaa gaagttcacc ttctcaagcc agtgaaatac agacttaatt cgtcatgact 2079
gaacgaattt gtttatttcc cattaggttt agtggagcta cacattaata tgtatcgcct 2139
tagagcaaga gctgtgttcc aggaaccaga tcacgatttt tagccatgga acaatatatc 2199
ccatgggaga agacctttca gtgtgaactg ttctattttt gtgttataat ttaaacttcg 2259
atttcctcat agtcctttaa gttgacattt ctgcttactg ctactggatt tttgctgcag 2319
aaatatatca gtggcccaca ttaaacatac cagttggatc atgataagca aaatgaaaga 2379
aataatgatt aagggaaaat taagtgactg tgttacactg cttctcccat gccagagaat 2439
aaactctttc aagcatcatc tttgaagagt cgtgtggtgt gaattggttt gtgtacatta 2499
gaatgtatgc acacatccat ggacactcag gatatagttg gcctaataat cggggcatgg 2559
gtaaaactta tgaaaatttc ctcatgctga attgtaattt tctcttacct gtaaagtaaa 2619
atttagatca attccatgtc tttgttaagt acagggattt aatatatttt gaatataatg 2679
ggtatgttct aaatttgaac tttgagaggc aatactgttg gaattatgtg gattctaact 2739
cattttaaca aggtagcctg acctgcataa gatcacttga atgttaggtt tcatagaact 2799
atactaatct tctcacaaaa ggtctataaa atacagtcgt tgaaaaaaat tttgtatcaa 2859
aatgtttgga aaattagaag cttctcctta acctgtattg atactgactt gaattatttt 2919
ctaaaattaa gagccgtata cctacctgta agtcttttca catatcattt aaacttttgt 2979
ttgtattatt actgatttac agcttagtta ttaatttttc tttataagaa tgccgtcgat 3039
gtgcatgctt ttatgttttt cagaaaaggg tgtgtttgga tgaaagtaaa aaaaaaaata 3099
aaatctttca ctgtctctaa tggctgtgat gtttaacatt ttttgaccct aaaattcacc 3159
aacagtctcc cagtacataa aataggctta atgactggcc ctgcattctt cacaatattt 3219
ttccctaagc tttgagcaaa gttttaaaaa aatacactaa aataatcaaa actgttaagc 3279
agtatattag tttggttata taaattcatc tgcaatttat aagatgcatg gccgatgtta 3339
atttgcttgg caattctgta atcattaagt gatctcagtg aaacatgtca aatgccttaa 3399
attaactaag ttggtgaata aaagtgccga tctggctaac tcttacacca tacatactga 3459
tagtttttca tatgtttcat ttccatgtga tttttaaaat ttagagtggc aacaattttg 3519
cttaatatgg gttacataag ctttattttt tcctttgttc ataattatat tctttgaata 3579
ggtctgtgtc aatcaagtga tctaactaga ctgatcatag atagaaggaa ataaggccaa 3639
gttcaagacc agcctgggca acatatcgag aacctgtcta caaaaaaatt aaaaaaaatt 3699
agccaggcat ggtggcgtac actgagtagt ttgtcccagc tactcgggag ggtgaggtgg 3759
gaggatcgct tcagcccagg aggttgagat tgcagtgagc catggacata ccactgcact 3819
acagcctagg taacagcacg agaccccaac tcttagaaaa tgaaaaggaa atatagaaat 3879
ataaaatttg cttattatag acacacagta actcccagat atgtaccaca aaaaatgtga 3939
aaagagagag aaatgtctac caaagcagta ttttgtgtgt ataattgcaa gcgcatagta 3999
aaataatttt aaccttaatt tgtttttagt agtgtttaga ttgaagattg agtgaaatat 4059
tttcttggca gatattccgt atctggtgga aagctacaat gcaatgtcgt tgtagttttg 4119
catggcttgc tttataaaca agattttttc tccctccttt tgggccagtt ttcattacga 4179
gtaactcaca ctttttgatt aaagaacttg aaattacgtt atcacttagt ataattgaca 4239
ttatatagag actatgtaac atgcaatcat tagaatcaaa attagtactt tggtcaaaat 4299
atttacaaca ttcacatact tgtcaaatat tcatgtaatt aactgaattt aaaaccttca 4359
actattatga agtgctcgtc tgtacaatcg ctaatttact cagtttagag tagctacaac 4419
tcttcgatac tatcatcaat atttgacatc ttttccaatt tgtgtatgaa aagtaaatct 4479
attcctgtag caactgggga gtcatatatg aggtcaaaga catatacctt gttattataa 4539
tatgtatact ataataatag ctggttatcc tgagcagggg aaaaggttat ttttaggaaa 4599
accacttcaa atagaaagct gaagtacttc taatatactg agggaagtat aatatgtgga 4659
acaaactctc aacaaaatgt ttattgatgt tgatgaaaca gatcagtttt tccatccgga 4719
ttattattgg ttcatgattt tatatgtgaa tatgtaagat atgttctgca attttataaa 4779
tgttcatgtc tttttttaaa aaaggtgcta ttgaaattct gtgtctccag caggcaagaa 4839
tacttgacta actctttttg tctctttatg gtattttcag aataaagtct gacttgtgtt 4899
tttgagatta ttggtgcctc attaattcag caataaagga aaatatgcat ctcaaaaat 4958
<210> 122
<211> 5094
<212> DNA
<213> Homo sapiens
<220>
<221> misc_feature
<222> 31..33
<223> ATG
<221> misc_feature
<222> 976..978
<223> TAG
<221> polyA_signal
<222> 5067..5072
<223> AATAAA
<400> 122

ctgctgtccc tggtgctcca cacgtactcc atg cgc tac ctg ctg ccc agc gtc 54
Met Arg Tyr Leu Leu Pro Ser Val
1 5

gtg ctc ctg ggc acg gcg ccc acc tac gtg ttg gcc tgg ggg gtc tgg 102
Val Leu Leu Gly Thr Ala Pro Thr Tyr Val Leu Ala Trp Gly Val Trp
10 15 20

cgg ctg ctc tcc gcc ttc ctg ccc gcc cgc ttc tac caa gcg ctg gac 150
Arg Leu Leu Ser Ala Phe Leu Pro Ala Arg Phe Tyr Gln Ala Leu Asp
25 30 35 40

gac cgg ctg tac tgc gtc tac cag agc atg gtg ctc ttc ttc ttc gag 198
Asp Arg Leu Tyr Cys Val Tyr Gln Ser Met Val Leu Phe Phe Phe Glu
45 50 55

aat tac acc ggg gtc cag ata ttg cta tat gga gat ttg cca aaa aat 246
Asn Tyr Thr Gly Val Gln Ile Leu Leu Tyr Gly Asp Leu Pro Lys Asn
60 65 70

aaa gaa aat ata ata tat tta gca aat cat caa agc aca gtt gac tgg 294
Lys Glu Asn Ile Ile Tyr Leu Ala Asn His Gln Ser Thr Val Asp Trp
75 80 85

att gtt gct gac atc ttg gcc atc agg cag aat gcg cta gga cat gtg 342
Ile Val Ala Asp Ile Leu Ala Ile Arg Gln Asn Ala Leu Gly His Val
90 95 100

cgc tac gtg ctg aaa gaa ggg tta aaa tgg ctg cca ttg tat ggg tgt 390
Arg Tyr Val Leu Lys Glu Gly Leu Lys Trp Leu Pro Leu Tyr Gly Cys
105 110 115 120

tac ttt gct cag cat gga gga atc tat gta aag cgc agt gcc aaa ttt 438
Tyr Phe Ala Gln His Gly Gly Ile Tyr Val Lys Arg Ser Ala Lys Phe
125 130 135

aac gag aaa gag atg cga aac aag ttg cag agc tac gtg gac gca gga 486
Asn Glu Lys Glu Met Arg Asn Lys Leu Gln Ser Tyr Val Asp Ala Gly
140 145 150

act cca atg tat ctt gtg att ttt cca gaa ggt aca agg tat aat cca 534
Thr Pro Met Tyr Leu Val Ile Phe Pro Glu Gly Thr Arg Tyr Asn Pro
155 160 165

gag caa aca aaa gtc ctt tca gct agt cag gca ttt gct gcc caa cgt 582
Glu Gln Thr Lys Val Leu Ser Ala Ser Gln Ala Phe Ala Ala Gln Arg
170 175 180

ggg aaa gac gat gga ggg cag cga aga gag tca ccg acc atg acg gaa 630
Gly Lys Asp Asp Gly Gly Gln Arg Arg Glu Ser Pro Thr Met Thr Glu
185 190 195 200

ttt ctc tgc aaa gaa tgt cca aaa att cat att cac att gat cgt atc 678
Phe Leu Cys Lys Glu Cys Pro Lys Ile His Ile His Ile Asp Arg Ile
205 210 215

gac aaa aaa gat gtc cca gaa gaa caa gaa cat atg aga aga tgg ctg 726
Asp Lys Lys Asp Val Pro Glu Glu Gln Glu His Met Arg Arg Trp Leu
220 225 230

cat gaa cgt ttc gaa atc aaa gat aag atg ctt ata gaa ttt tat gag 774
His Glu Arg Phe Glu Ile Lys Asp Lys Met Leu Ile Glu Phe Tyr Glu
235 240 245

tca cca gat cca gaa aga aga aaa aga ttt cct ggg aaa agt gtt aat 822
Ser Pro Asp Pro Glu Arg Arg Lys Arg Phe Pro Gly Lys Ser Val Asn
250 255 260

tcc aaa tta agt atc aag aag act tta cca tca atg ttg atc tta agt 870
Ser Lys Leu Ser Ile Lys Lys Thr Leu Pro Ser Met Leu Ile Leu Ser
265 270 275 280

ggt ttg act gca ggc atg ctt atg acc gat gct gga agg aag ctg tat 918
Gly Leu Thr Ala Gly Met Leu Met Thr Asp Ala Gly Arg Lys Leu Tyr
285 290 295

gtg aac acc tgg ata tat gga acc cta ctt ggc tgc ctg tgg gtt act 966
Val Asn Thr Trp Ile Tyr Gly Thr Leu Leu Gly Cys Leu Trp Val Thr
300 305 310

att aaa gca tag acaagtagct gtctccagac agtgggatgt gctacattgt 1018
Ile Lys Ala *
315

ctatttttgg cggctgcaca tgacatcaaa ttgtttcctg aatttattaa ggagtgtaaa 1078
taaagccttg ttgattgaag attggataat agaatttgtg acgaaagctg atatgcaatg 1138
gtcttgggca aacatacctg gttgtacaac tttagcatcg gggctgctgg aagggtaaaa 1198
gctaaatgga gtttctcctg ctctgtccat ttcctatgaa ctaatgacaa cttgagaagg 1258
ctgggaggat tgtgtatttt gcaagtcaga tggctgcatt tttgagcatt aatttgcagc 1318
gtatttcact ttttctgtta ttttcaattt attacaactt gacagctcca agctcttatt 1378
actaaagtat ttagtatctt gcagctagtt aatatttcat cttttgctta tttctacaag 1438
tcagtgaaat aaattgtatt taggaagtgt caggatgttc aaaggaaagg gtaaaaagtg 1498
ttcatgggga aaaagctctg tttagcacat gattttattg tattgcgtta ttagctgatt 1558
ttactcattt tatatttgca aaataaattt ctaatattta ttgaaattgc ttaatttgca 1618
caccctgtac acacagaaaa tggtataaaa tatgagaacg aagtttaaaa ttgtgactct 1678
gattcattat agcagaactt taaatttccc agctttttga agatttaagc tacgctatta 1738
gtacttccct ttgtctgtgc cataagtgct tgaaaacgtt aaggttttct gttttgtttt 1798
gtttttttaa tatcaaaaga gtcggtgtga accttggttg gaccccaagt tcacaagatt 1858
tttaaggtga tgagagcctg cagacattct gcctagattt actagcgtgt gccttttgcc 1918
tgcttctctt tgatttcaca gaatattcat tcagaagtcg cgtttctgta gtgtggtgga 1978
ttcccactgg gctctggtcc ttcccttgga tcccgtcagt ggtgctgctc agcggcttgc 2038
acgtagactt gctaggaaga aatgcagagc cagcctgtgc tgcccacttt cagagttgaa 2098
ctctttaagc ccttgtgagt gggcttcacc agctactgca gaggcatttt gcatttgtct 2158
gtgtcaagaa gttcaccttc tcaagccagt gaaatacaga cttaattcgt catgactgaa 2218
cgaatttgtt tatttcccat taggtttagt ggagctacac attaatatgt atcgccttag 2278
agcaagagct gtgttccagg aaccagatca cgatttttag ccatggaaca atatatccca 2338
tgggagaaga cctttcagtg tgaactgttc tatttttgtg ttataattta aacttcgatt 2398
tcctcatagt cctttaagtt gacatttctg cttactgcta ctggattttt gctgcagaaa 2458
tatatcagtg gcccacatta aacataccag ttggatcatg ataagcaaaa tgaaagaaat 2518
aatgattaag ggaaaattaa gtgactgtgt tacactgctt ctcccatgcc agagaataaa 2578
ctctttcaag catcatcttt gaagagtcgt gtggtgtgaa ttggtttgtg tacattagaa 2638
tgtatgcaca catccatgga cactcaggat atagttggcc taataatcgg ggcatgggta 2698
aaacttatga aaatttcctc atgctgaatt gtaattttct cttacctgta aagtaaaatt 2758
tagatcaatt ccatgtcttt gttaagtaca gggatttaat atattttgaa tataatgggt 2818
atgttctaaa tttgaacttt gagaggcaat actgttggaa ttatgtggat tctaactcat 2878
tttaacaagg tagcctgacc tgcataagat cacttgaatg ttaggtttca tagaactata 2938
ctaatcttct cacaaaaggt ctataaaata cagtcgttga aaaaaatttt gtatcaaaat 2998
gtttggaaaa ttagaagctt ctccttaacc tgtattgata ctgacttgaa ttattttcta 3058
aaattaagag ccgtatacct acctgtaagt cttttcacat atcatttaaa cttttgtttg 3118
tattattact gatttacagc ttagttatta atttttcttt ataagaatgc cgtcgatgtg 3178
catgctttta tgtttttcag aaaagggtgt gtttggatga aagtaaaaaa aaaaataaaa 3238
tctttcactg tctctaatgg ctgtgctgtt taacattttt tgaccctaaa attcaccaac 3298
agtctcccag tacataaaat aggcttaatg actggccctg cattcttcac aatatttttc 3358
cctaagcttt gagcaaagtt ttaaaaaaat acactaaaat aatcaaaact gttaagcagt 3418
atattagttt ggttatataa attcatctgc aatttataag atgcatggcc gatgttaatt 3478
tgcttggcaa ttctgtaatc attaagtgat ctcagtgaaa catgtcaaat gccttaaatt 3538
aactaagttg gtgaataaaa gtgccgatct ggctaactct tacaccatac atactgatag 3598
tttttcatat gtttcatttc catgtgattt ttaaaattta gagtggcaac aattttgctt 3658
aatatgggtt acataagctt tattttttcc tttgttcata attatattct ttgaataggt 3718
ctgtgtcaat caagtgatct aactagactg atcatagata gaaggaaata aggccaagtt 3778
caagaccagc ctgggcaaca tatcgagaac ctgtctacaa aaaaattaaa aaaaattagc 3838
caggcatggt ggcgtacact gagtagtttg tcccagctac tcgggagggt gaggtgggag 3898
gatcgcttca gcccaggagg ttgagattgc agtgagccat ggacatacca ctgcactaca 3958
gcctaggtaa cagcacgaga ccccaactct tagaaaatga aaaggaaata tagaaatata 4018
aaatttgctt attatagaca cacagtaact cccagatatg taccacaaaa aatgtgaaaa 4078
gagagagaaa tgtctaccaa agcagtattt tgtgtgtata attgcaagcg catagtaaaa 4138
taattttaac cttaatttgt ttttagtagt gtttagattg aagattgagt gaaatatttt 4198
cttggcagat attccgtatc tggtggaaag ctacaatgca atgtcgttgt agttttgcat 4258
ggcttgcttt ataaacaaga ttttttctcc ctccttttgg gccagttttc attacgagta 4318
actcacactt tttgattaaa gaacttgaaa ttacgttatc acttagtata attgacatta 4378
tatagagact atgtaacatg caatcattag aatcaaaatt agtactttgg tcaaaatatt 4438
tacaacattc acatacttgt caaatattca tgtaattaac tgaatttaaa accttcaact 4498
attatgaagt gctcgtctgt acaatcgcta atttactcag tttagagtag ctacaactct 4558
tcgatactat catcaatatt tgacatcttt tccaatttgt gtatgaaaag taaatctatt 4618
cctgtagcaa ctggggagtc atatatgagg tcaaagacat ataccttgtt attataatat 4678
gtatactata ataatagctg gttatcctga gcaggggaaa aggttatttt taggaaaacc 4738
acttcaaata gaaagctgaa gtacttctaa tatactgagg gaagtataat atgtggaaca 4798
aactctcaac aaaatgttta ttgatgttga tgaaacagat cagtttttcc atccggatta 4858
ttattggttc atgattttat atgtgaatat gtaagatatg ttctgcaatt ttataaatgt 4918
tcatgtcttt ttttaaaaaa ggtgctattg aaattctgtg tctccagcag gcaagaatac 4978
ttgactaact ctttttgtct ctttatggta ttttcagaat aaagtctgac ttgtgttttt 5038
gagattattg gtgcctcatt aattcagcaa taaaggaaaa tatgcatctc aaaaat 5094
<210> 123
<211> 5049
<212> DNA
<213> Homo sapiens
<220>
<221> misc_feature
<222> 31..33
<223> ATG
<221> misc_feature
<222> 931..933
<223> TAG
<221> polyA_signal
<222> 5022..5027
<223> AATAAA
<400> 123

ctgctgtccc tggtgctcca cacgtactcc atg cgc tac ctg ctg ccc agc gtc 54
Met Arg Tyr Leu Leu Pro Ser Val
1 5

gtg ctc ctg ggc acg gcg ccc acc tac gtg ttg gcc tgg ggg gtc tgg 102
Val Leu Leu Gly Thr Ala Pro Thr Tyr Val Leu Ala Trp Gly Val Trp
10 15 20

cgg ctg ctc tcc gcc ttc ctg ccc gcc cgc ttc tac caa gcg ctg gac 150
Arg Leu Leu Ser Ala Phe Leu Pro Ala Arg Phe Tyr Gln Ala Leu Asp
25 30 35 40

gac cgg ctg tac tgc gtc tac cag agc atg gtg ctc ttc ttc ttc gag 198
Asp Arg Leu Tyr Cys Val Tyr Gln Ser Met Val Leu Phe Phe Phe Glu
45 50 55

aat tac acc ggg gtc cag ata ttg cta tat gga gat ttg cca aaa aat 246
Asn Tyr Thr Gly Val Gln Ile Leu Leu Tyr Gly Asp Leu Pro Lys Asn
60 65 70

aaa gaa aat ata ata tat tta gca aat cat caa agc aca gtt gac tgg 294
Lys Glu Asn Ile Ile Tyr Leu Ala Asn His Gln Ser Thr Val Asp Trp
75 80 85

att gtt gct gac atc ttg gcc atc agg cag aat gcg cta gga cat gtg 342
Ile Val Ala Asp Ile Leu Ala Ile Arg Gln Asn Ala Leu Gly His Val
90 95 100

cgc tac gtg ctg aaa gaa ggg tta aaa tgg ctg cca ttg tat ggg tgt 390
Arg Tyr Val Leu Lys Glu Gly Leu Lys Trp Leu Pro Leu Tyr Gly Cys
105 110 115 120

tac ttt gct cag cat gga gga atc tat gta aag cgc agt gcc aaa ttt 438
Tyr Phe Ala Gln His Gly Gly Ile Tyr Val Lys Arg Ser Ala Lys Phe
125 130 135

aac gag aaa gag atg cga aac aag ttg cag agc tac gtg gac gca gga 486
Asn Glu Lys Glu Met Arg Asn Lys Leu Gln Ser Tyr Val Asp Ala Gly
140 145 150

act cca atg tat ctt gtg att ttt cca gaa ggt aca agg tat aat cca 534
Thr Pro Met Tyr Leu Val Ile Phe Pro Glu Gly Thr Arg Tyr Asn Pro
155 160 165

gag caa aca aaa gtc ctt tca gct agt cag gca ttt gct gcc caa cgt 582
Glu Gln Thr Lys Val Leu Ser Ala Ser Gln Ala Phe Ala Ala Gln Arg
170 175 180

gaa ttt ctc tgc aaa gaa tgt cca aaa att cat att cac att gat cgt 630
Glu Phe Leu Cys Lys Glu Cys Pro Lys Ile His Ile His Ile Asp Arg
185 190 195 200

atc gac aaa aaa gat gtc cca gaa gaa caa gaa cat atg aga aga tgg 678
Ile Asp Lys Lys Asp Val Pro Glu Glu Gln Glu His Met Arg Arg Trp
205 210 215

ctg cat gaa cgt ttc gaa atc aaa gat aag atg ctt ata gaa ttt tat 726
Leu His Glu Arg Phe Glu Ile Lys Asp Lys Met Leu Ile Glu Phe Tyr
220 225 230

gag tca cca gat cca gaa aga aga aaa aga ttt cct ggg aaa agt gtt 774
Glu Ser Pro Asp Pro Glu Arg Arg Lys Arg Phe Pro Gly Lys Ser Val
235 240 245

aat tcc aaa tta agt atc aag aag act tta cca tca atg ttg atc tta 822
Asn Ser Lys Leu Ser Ile Lys Lys Thr Leu Pro Ser Met Leu Ile Leu
250 255 260

agt ggt ttg act gca ggc atg ctt atg acc gat gct gga agg aag ctg 870
Ser Gly Leu Thr Ala Gly Met Leu Met Thr Asp Ala Gly Arg Lys Leu
265 270 275 280

tat gtg aac acc tgg ata tat gga acc cta ctt ggc tgc ctg tgg gtt 918
Tyr Val Asn Thr Trp Ile Tyr Gly Thr Leu Leu Gly Cys Leu Trp Val
285 290 295

act att aaa gca tag acaagtagct gtctccagac agtgggatgt gctacattgt 973
Thr Ile Lys Ala *
300

ctatttttgg cggctgcaca tgacatcaaa ttgtttcctg aatttattaa ggagtgtaaa 1033 

taaagccttg ttgattgaag attggataat agaatttgtg acgaaagctg atatgcaatg 1093 

gtcttgggca aacatacctg gttgtacaac tttagcatcg gggctgctgg aagggtaaaa 1153 

gctaaatgga gtttctcctg ctctgtccat ttcctatgaa ctaatgacaa cttgagaagg 1213 

ctgggaggat tgtgtatttt gcaagtcaga tggctgcatt tttgagcatt aatttgcagc 1273 

gtatttcact ttttctgtta ttttcaattt attacaactt gacagctcca agctcttatt 1333

actaaagtat ttagtatctt gcagctagtt aatatttcat cttttgctta tttctacaag 1393 

tcagtgaaat aaattgtatt taggaagtgt caggatgttc aaaggaaagg gtaaaaagtg 1453 

ttcatgggga aaaagctctg tttagcacat gattttattg tattgcgtta ttagctgatt 1513 

ttactcattt tatatttgca aaataaattt ctaatattta ttgaaattgc ttaatttgca 1573 

caccctgtac acacagaaaa tggtataaaa tatgagaacg aagtttaaaa ttgtgactct 1633 

gattcattat agcagaactt taaatttccc agctttttga agatttaagc tacgctatta 1693 

gtacttccct ttgtctgtgc cataagtgct tgaaaacgtt aaggttttct gttttgtttt 1753 

gtttttttaa tatcaaaaga gtcggtgtga accttggttg gaccccaagt tcacaagatt 1813 

tttaaggtga tgagagcctg cagacattct gcctagattt actagcgtgt gccttttgcc 1873 

tgcttctctt tgatttcaca gaatattcat tcagaagtcg cgtttctgta gtgtggtgga 1933 

ttcccactgg gctctggtcc ttcccttgga tcccgtcagt ggtgctgctc agcggcttgc 1993 

acgtagactt gctaggaaga aatgcagagc cagcctgtgc tgcccacttt cagagttgaa 2053 

ctctttaagc ccttgtgagt gggcttcacc agctactgca gaggcatttt gcatttgtct 2113 

gtgtcaagaa gttcaccttc tcaagccagt gaaatacaga cttaattcgt catgactgaa 2173 

cgaatttgtt tatttcccat taggtttagt ggagctacac attaatatgt atcgccttag 2233 

agcaagagct gtgttccagg aaccagatca cgatttttag ccatggaaca atatatccca 2293 

tgggagaaga cctttcagtg tgaactgttc tatttttgtg ttataattta aacttcgatt 2353 

tcctcatagt cctttaagtt gacatttctg cttactgcta ctggattttt gctgcagaaa 2413 

tatatcagtg gcccacatta aacataccag ttggatcatg ataagcaaaa tgaaagaaat 2473 

aatgattaag ggaaaattaa gtgactgtgt tacactgctt ctcccatgcc agagaataaa 2533 

ctctttcaag catcatcttt gaagagtcgt gtggtgtgaa ttggtttgtg tacattagaa 2593 

tgtatgcaca catccatgga cactcaggat atagttggcc taataatcgg ggcatgggta 2653 

aaacttatga aaatttcctc atgctgaatt gtaattttct cttacctgta aagtaaaatt 2713 

tagatcaatt ccatgtcttt gttaagtaca gggatttaat atattttgaa tataatgggt 2773 

atgttctaaa tttgaacttt gagaggcaat actgttggaa ttatgtggat tctaactcat 2833 

tttaacaagg tagcctgacc tgcataagat cacttgaatg ttaggtttca tagaactata 2893 

ctaatcttct cacaaaaggt ctataaaata cagtcgttga aaaaaatttt gtatcaaaat 2953 

gtttggaaaa ttagaagctt ctccttaacc tgtattgata ctgacttgaa ttattttcta 3013 

aaattaagag ccgtatacct acctgtaagt cttttcacat atcatttaaa cttttgtttg 3073 

tattattact gatttacagc ttagttatta atttttcttt ataagaatgc cgtcgatgtg 3133 

catgctttta tgtttttcag aaaagggtgt gtttggatga aagtaaaaaa aaaaataaaa 3193 

tctttcactg tctctaatgg ctgtgctgtt taacattttt tgaccctaaa attcaccaac 3253 

agtctcccag tacataaaat aggcttaatg actggccctg cattcttcac aatatttttc 3313 

cctaagcttt gagcaaagtt ttaaaaaaat acactaaaat aatcaaaact gttaagcagt 3373 

atattagttt ggttatataa attcatctgc aatttataag atgcatggcc gatgttaatt 3433

tgcttggcaa ttctgtaatc attaagtgat ctcagtgaaa catgtcaaat gccttaaatt 3493 

aactaagttg gtgaataaaa gtgccgatct ggctaactct tacaccatac atactgatag 3553 

tttttcatat gtttcatttc catgtgattt ttaaaattta gagtggcaac aattttgctt 3613 

aatatgggtt acataagctt tattttttcc tttgttcata attatattct ttgaataggt 3673

ctgtgtcaat caagtgatct aactagactg atcatagata gaaggaaata aggccaagtt 3733

caagaccagc ctgggcaaca tatcgagaac ctgtctacaa aaaaattaaa aaaaattagc 3793

caggcatggt ggcgtacact gagtagtttg tcccagctac tcgggagggt gaggtgggag 3853

gatcgcttca gcccaggagg ttgagattgc agtgagccat ggacatacca ctgcactaca 3913 

gcctaggtaa cagcacgaga ccccaactct tagaaaatga aaaggaaata tagaaatata 3973 

aaatttgctt attatagaca cacagtaact cccagatatg taccacaaaa aatgtgaaaa 4033
gagagagaaa tgtctaccaa agcagtattt tgtgtgtata attgcaagcg catagtaaaa 4093
taattttaac cttaatttgt ttttagtagt gtttagattg aagattgagt gaaatatttt 4153
cttggcagat attccgtatc tggtggaaag ctacaatgca atgtcgttgt agttttgcat 4213
ggcttgcttt ataaacaaga ttttttctcc ctccttttgg gccagttttc attacgagta 4273
actcacactt tttgattaaa gaacttgaaa ttacgttatc acttagtata attgacatta 4333
tatagagact atgtaacatg caatcattag aatcaaaatt agtactttgg tcaaaatatt 4393
tacaacattc acatacttgt caaatattca tgtaattaac tgaatttaaa accttcaact 4453
attatgaagt gctcgtctgt acaatcgcta atttactcag tttagagtag ctacaactct 4513
tcgatactat catcaatatt tgacatcttt tccaatttgt gtatgaaaag taaatctatt 4573
cctgtagcaa ctggggagtc atatatgagg tcaaagacat ataccttgtt attataatat 4633
gtatactata ataatagctg gttatcctga gcaggggaaa aggttatttt taggaaaacc 4693
acttcaaata gaaagctgaa gtacttctaa tatactgagg gaagtataat atgtggaaca 4753
aactctcaac aaaatgttta ttgatgttga tgaaacagat cagtttttcc atccggatta 4813
ttattggttc atgattttat atgtgaatat gtaagatatg ttctgcaatt ttataaatgt 4873
tcatgtcttt ttttaaaaaa ggtgctattg aaattctgtg tctccagcag gcaagaatac 4933
ttgactaact ctttttgtct ctttatggta ttttcagaat aaagtctgac ttgtgttttt 4993
gagattattg gtgcctcatt aattcagcaa taaaggaaaa tatgcatctc aaaaat 5049
<210> 124 
<211> 5324 
<212> DNA 

<213> Homo sapiens 
<220>
<221> misc_feature
<222> 31..33
<223> ATG
<221> misc_feature
<222> 586..588
<223> TAA
<221> polyA_signal
<222> 5297..5302
<223> AATAAA
<400> 124

ctgctgtccc tggtgctcca cacgtactcc 



gtg etc ctg 
Val Leu Leu 
10 

egg etg cte 
Arg Leu Leu 
25 

gac egg ctg 
Asp Arg Leu 

aat tac aec 
Asn Tyr Thr 



aaa gaa 
Lys Glu 

att gtt 
He Val 

90 
cge tae 
Arg Tyr 
105 

tac ttt 
Tyr Phe 



aae gag aaa 
Asn Glu Lys 



ggc 

Gly 



acg 
Thr 

gee 
Ala 



gcg ccc aec 
Ala Pro Thr 
15 

ttc ctg cce 
Phe Leu Pro 
30 

gte tac cag 
Val Tyr Gin 

cag ata ttg 
Gin He Leu 




atg cge 
Met Arg 
1 

tac gtg 
Tyr Val 

gee cge 
Ala Arg 



tae ctg 
Tyr Leu 



age 
Ser 

eta 

Leu 

65 

aat 

Asn 



atg 

Met 

50 

tat 

Tyr 

cat 
His 



ttg 
Leu 

ttc 
Phe 
35 
gtg 
Val 



gee 

Ala 

20 

tac 

Tyr 

etc 
Leu 



ctg ccc age gte 
Leu Pro Ser Val 
5 

tgg ggg gte tgg 
Trp Gly Val Trp 



caa gcg 
Gin Ala 

ttc ttc 
Phe Phe 



gga gat 
Gly Asp 

caa age 
Gin Ser 



gag 
Glu 



gga ate 
Gly He 

aae aag 
Asn Lys 



agg cag 
Arg Gin 

aaa tgg 
Lys Trp 

tat gta 
Tyr Val 
130 
ttg cag 
Leu Gin 



aat 
Asn 

ctg 
Leu 
115 
aag 
Lys 



gcg 
Ala 
100 
cea 
Pro 

cge 
Arg 



ttg 
Leu 

aca 

Thr 

85 

eta 

Leu 



cea 

Pro 

70 

gtt 

Val 

gga 
Gly 



ctg 
Leu 

ttc 

Phe 

55 

aaa 

Lys 



gac 

Asp 

40 

gag 

Glu 

aat 
Asn 



age tac 
Ser Tyr 



ttg tat 
Leu Tyr 

agt gee 
Ser Ala 

gtg gac 
Val Asp 



gac tgg 
Asp Trp 

eat gtg 
His Val 

ggg tgt 
Gly Cys 
120 
aaa ttt 
Lys Phe 
135 

gca gga 
Ala Gly 



4093 
4153 
4213 
4273 
4333 
4393 
4453 
4513 
4573 
4633 
4693 
4753 
4813 
4873 
4933 
4993 
5049 



54 
102 
150 
198 
246 
294 
342 
390 
438 
486 
140 145 150

act cca atg tat ctt gtg att ttt cca gaa ggt aca agg tat aat cca 534
Thr Pro Met Tyr Leu Val Ile Phe Pro Glu Gly Thr Arg Tyr Asn Pro

155 160 165

gag caa aca aaa gtc ctt tca gct agt cag gca ttt gct gcc caa cgt 582
Glu Gln Thr Lys Val Leu Ser Ala Ser Gln Ala Phe Ala Ala Gln Arg

170 175 180

ggc taa agcagtcctc ctgagtagtt aggactacag acatacacgt gccaccgcgc 638
Gly *
185

ccagctccgt gttctctttg tttccctgcc tcctgctctt ccacttatct ttgcatggca 698 

ggccttgcag tattaaaaca tgtgctaaca ccacgaataa aggcaactca cgttgctttt 758 

gattgcatga agaattattt agatgcaatt tatgatgtta cggtggttta tgaagggaaa 818 

gacgatggag ggcagcgaag agagtcaccg accatgacgg aatttctctg caaagaatgt 878 

ccaaaaattc atattcacat tgatcgtatc gacaaaaaag atgtcccaga agaacaagaa 938

catatgagaa gatggctgca tgaacgtttc gaaatcaaag ataagatgct tatagaattt 998

tatgagtcac cagatccaga aagaagaaaa agatttcctg ggaaaagtgt taattccaaa 1058 

ttaagtatca agaagacttt accatcaatg ttgatcttaa gtggtttgac tgcaggcatg 1118 

cttatgaccg atgctggaag gaagctgtat gtgaacacct ggatatatgg aaccctactt 1178 

ggctgcctgt gggttactat taaagcatag acaagtagct gtctccagac agtgggatgt 1238

gctacattgt ctatttttgg cggctgcaca tgacatcaaa ttgtttcctg aatttattaa 1298 

ggagtgtaaa taaagccttg ttgattgaag attggataat agaatttgtg acgaaagctg 1358 

atatgcaatg gtcttgggca aacatacctg gttgtacaac tttagcatcg gggctgctgg 1418 

aagggtaaaa gctaaatgga gtttctcctg ctctgtccat ttcctatgaa ctaatgacaa 1478 

cttgagaagg ctgggaggat tgtgtatttt gcaagtcaga tggctgcatt tttgagcatt 1538 

aatttgcagc gtatttcact ttttctgtta ttttcaattt attacaactt gacagctcca 1598 

agctcttatt actaaagtat ttagtatctt gcagctagtt aatatttcat cttttgctta 1658 

tttctacaag tcagtgaaat aaattgtatt taggaagtgt caggatgttc aaaggaaagg 1718 

gtaaaaagtg ttcatgggga aaaagctctg tttagcacat gattttattg tattgcgtta 1778 

ttagctgatt ttactcattt tatatttgca aaataaattt ctaatattta ttgaaattgc 1838 

ttaatttgca caccctgtac acacagaaaa tggtataaaa tatgagaacg aagtttaaaa 1898 

ttgtgactct gattcattat agcagaactt taaatttccc agctttttga agatttaagc 1958 

tacgctatta gtacttccct ttgtctgtgc cataagtgct tgaaaacgtt aaggttttct 2018 

gttttgtttt gtttttttaa tatcaaaaga gtcggtgtga accttggttg gaccccaagt 2078 

tcacaagatt tttaaggtga tgagagcctg cagacattct gcctagattt actagcgtgt 2138 

gccttttgcc tgcttctctt tgatttcaca gaatattcat tcagaagtcg cgtttctgta 2198 

gtgtggtgga ttcccactgg gctctggtcc ttcccttgga tcccgtcagt ggtgctgctc 2258 

agcggcttgc acgtagactt gctaggaaga aatgcagagc cagcctgtgc tgcccacttt 2318 

cagagttgaa ctctttaagc ccttgtgagt gggcttcacc agctactgca gaggcatttt 2378 

gcatttgtct gtgtcaagaa gttcaccttc tcaagccagt gaaatacaga cttaattcgt 2438 

catgactgaa cgaatttgtt tatttcccat taggtttagt ggagctacac attaatatgt 2498 

atcgccttag agpaagagct gtgttccagg aaccagatca cgatttttag ccatggaaca 2558 

atatatccca tg^gagaaga cctttcagtg tgaactgttc tatttttgtg ttataattta 2618 

aacttcgatt tcctcatagt cctttaagtt gacatttctg cttactgcta ctggattttt 2678 

gctgcagaaa tatatcagtg gcccacatta aacataccag ttggatcatg ataagcaaaa 2738 

tgaaagaaat aatgattaag ggaaaattaa gtgactgtgt tacactgctt ctcccatgcc 2798 

agagaataaa ctctttcaag catcatcttt gaagagtcgt gtggtgtgaa ttggtttgtg 2858 

tacattagaa tgtatgcaca catccatgga cactcaggat atagttggcc taataatcgg 2918 

ggcatgggta aaacttatga aaatttcctc atgctgaatt gtaattttct cttacctgta 2978 

aagtaaaatt tagatcaatt ccatgtcttt gttaagtaca gggatttaat atattttgaa 3038 

tataatgggt atgttctaaa tttgaacttt gagaggcaat actgttggaa ttatgtggat 3098 

tctaactcat tttaacaagg tagcctgacc tgcataagat cacttgaatg ttaggtttca 3158 

tagaactata ctaatcttct cacaaaaggt ctataaaata cagtcgttga aaaaaatttt 3218 

gtatcaaaat gtttggaaaa ttagaagctt ctccttaacc tgtattgata ctgacttgaa 3278

ttattttcta aaattaagag ccgtatacct acctgtaagt cttttcacat atcatttaaa 3338 

cttttgtttg tattattact gatttacagc ttagttatta atttttcttt ataagaatgc 3398 

cgtcgatgtg catgctttta tgtttttcag aaaagggtgt gtttggatga aagtaaaaaa 3458 

aaaaataaaa tctttcactg tctctaatgg ctgtgctgtt taacattttt tgaccctaaa 3518 

attcaccaac agtctcccag tacataaaat aggcttaatg actggccctg cattcttcac 3578

aatatttttc cctaagcttt gagcaaagtt ttaaaaaaat acactaaaat aatcaaaact 3638

gttaagcagt atattagttt ggttatataa attcatctgc aatttataag atgcatggcc 3698 



gatgttaatt tgcttggcaa ttctgtaatc attaagtgat ctcagtgaaa catgtcaaat 3758

gccttaaatt aactaagttg gtgaataaaa gtgccgatct ggctaactct tacaccatac 3818

atactgatag tttttcatat gtttcatttc catgtgattt ttaaaattta gagtggcaac 3878

aattttgctt aatatgggtt acataagctt tattttttcc tttgttcata attatattct 3938

ttgaataggt ctgtgtcaat caagtgatct aactagactg atcatagata gaaggaaata 3998

aggccaagtt ca^gaccagc ctgggcaaca tatcgagaac ctgtctacaa aaaaattaaa 4058

aaaaattagc caggcatggt ggcgtacact gagtagtttg tcccagctac tcgggagggt 4118

gaggtgggag gatcgcttca gcccaggagg ttgagattgc agtgagccat ggacatacca 4178

ctgcactaca gcptaggtaa cagcacgaga ccccaactct tagaaaatga aaaggaaata 4238

tagaaatata aaktttgctt attatagaca cacagtaact cccagatatg taccacaaaa 4298

aatgtgaaaa gagagagaaa tgtctaccaa agcagtattt tgtgtgtata attgcaagcg 4358

catagtaaaa takttttaac cttaatttgt ttttagtagt gtttagattg aagattgagt 4418

gaaatatttt cttggcagat attccgtatc tggtggaaag ctacaatgca atgtcgttgt 4478

agttttgcat ggbttgcttt ataaacaaga ttttttctcc ctccttttgg gccagttttc 4538

attacgagta actcacactt tttgattaaa gaacttgaaa ttacgttatc acttagtata 4598

attgacatta tatagagact atgtaacatg daatcattag aatcaaaatt agtactttgg 4658

tcaaaatatt tacaacattc acatacttgt caaatattca tgtaattaac tgaatttaaa 4718

accttcaact attatgaagt gctcgtctgt acaatcgcta atttactcag tttagagtag 4778

ctacaactct tcgatactat catcaatatt tgacatcttt tccaatttgt gtatgaaaag 4838

taaatctatt cctgtagcaa ctggggagtc atatatgagg tcaaagacat ataccttgtt 4898

attataatat gtatactata ataatagctg gttatcctga gcaggggaaa aggttatttt 4958

taggaaaacc acttcaaata gaaagctgaa gtacttctaa tatactgagg gaagtataat 5018

atgtggaaca aactctcaac aaaatgttta ttgatgttga tgaaacagat cagtttttcc 5078

atccggatta ttattggttc atgattttat atgtgaatat gtaagatatg ttctgcaatt 5138

ttataaatgt tcatgtcttt ttttaaaaaa ggtgctattg aaattctgtg tctccagcag 5198

gcaagaatac ttgactaact ctttttgtct ctttatggta ttttcagaat aaagtctgac 5258

ttgtgttttt gagattattg gtgcctcatt aattcagcaa taaaggaaaa tatgcatctc 5318

aaaaat 5324
<210> 125
<211> 77
<212> PRT
<213> Homo sapiens
<400> 125

Met Arg Tyr Leu Leu Pro Ser Val Val Leu Leu Gly Thr Ala Pro Thr
1 5 10 15

Tyr Val Leu Ala Trp Gly Val Trp Arg Leu Leu Ser Ala Phe Leu Pro
20 25 30

Ala Arg Phe Tyr Gln Ala Leu Asp Asp Arg Leu Tyr Cys Val Tyr Gln
35 40 45

Ser Met Val Leu Phe Phe Phe Glu Asn Tyr Thr Gly Val Gln Leu Thr
50 55 60

Gly Leu Leu Leu Thr Ser Trp Pro Ser Gly Arg Met Arg
65 70 75
<210> 126
<211> 238
<212> PRT
<213> Homo sapiens
<220>
<221> SITE
<222> 98..103
<223> Box II
<400> 126

Met Arg Tyr Leu Leu Pro Ser Val Val Leu Leu Gly Thr Ala Pro Thr
1 5 10 15

Tyr Val Leu Ala Trp Gly Val Trp Arg Leu Leu Ser Ala Phe Leu Pro
20 25 30

Ala Arg Phe Tyr Gln Ala Leu Asp Asp Arg Leu Tyr Cys Val Tyr Gln
35 40 45

Ser Met Val Leu Phe Phe Phe Glu Asn Tyr Thr Gly Val Gln His Gly
50 55 60

Gly Ile Tyr Val Lys Arg Ser Ala Lys Phe Asn Glu Lys Glu Met Arg
65 70 75 80

Asn Lys Leu Gln Ser Tyr Val Asp Ala Gly Thr Pro Met Tyr Leu Val
85 90 95

Ile Phe Pro Glu Gly Thr Arg Tyr Asn Pro Glu Gln Thr Lys Val Leu
100 105 110

Ser Ala Ser Gln Ala Phe Ala Ala Gln Arg Glu Phe Leu Cys Lys Glu
115 120 125

Cys Pro Lys Ile His Ile His Ile Asp Arg Ile Asp Lys Lys Asp Val
130 135 140

Pro Glu Glu Gln Glu His Met Arg Arg Trp Leu His Glu Arg Phe Glu
145 150 155 160

Ile Lys Asp Lys Met Leu Ile Glu Phe Tyr Glu Ser Pro Asp Pro Glu
165 170 175

Arg Arg Lys Arg Phe Pro Gly Lys Ser Val Asn Ser Lys Leu Ser Ile
180 185 190

Lys Lys Thr Leu Pro Ser Met Leu Ile Leu Ser Gly Leu Thr Ala Gly
195 200 205

Met Leu Met Thr Asp Ala Gly Arg Lys Leu Tyr Val Asn Thr Trp Ile
210 215 220

Tyr Gly Thr Leu Leu Gly Cys Leu Trp Val Thr Ile Lys Ala
225 230 235

<210> 127
<211> 291
<212> PRT
<213> Homo sapiens
<220>
<221> SITE
<222> 98..103
<223> Box II
<221> SITE
<222> 149..157
<223> Box III
<400> 127

Met Arg Tyr Leu Leu Pro Ser Val Val Leu Leu Gly Thr Ala Pro Thr
1 5 10 15

Tyr Val Leu Ala Trp Gly Val Trp Arg Leu Leu Ser Ala Phe Leu Pro
20 25 30

Ala Arg Phe Tyr Gln Ala Leu Asp Asp Arg Leu Tyr Cys Val Tyr Gln
35 40 45

Ser Met Val Leu Phe Phe Phe Glu Asn Tyr Thr Gly Val Gln His Gly
50 55 60

Gly Ile Tyr Val Lys Arg Ser Ala Lys Phe Asn Glu Lys Glu Met Arg
65 70 75 80

Asn Lys Leu Gln Ser Tyr Val Asp Ala Gly Thr Pro Met Tyr Leu Val
85 90 95

Ile Phe Pro Glu Gly Thr Arg Tyr Asn Pro Glu Gln Thr Lys Val Leu
100 105 110

Ser Ala Ser Gln Ala Phe Ala Ala Gln Arg Gly Leu Ala Val Leu Lys
115 120 125

His Val Leu Thr Pro Arg Ile Lys Ala Thr His Val Ala Phe Asp Cys
130 135 140

Met Lys Asn Tyr Leu Asp Ala Ile Tyr Asp Val Thr Val Val Tyr Glu
145 150 155 160

Gly Lys Asp Asp Gly Gly Gln Arg Arg Glu Ser Pro Thr Met Thr Glu
165 170 175

Phe Leu Cys Lys Glu Cys Pro Lys Ile His Ile His Ile Asp Arg Ile
180 185 190

Asp Lys Lys Asp Val Pro Glu Glu Gln Glu His Met Arg Arg Trp Leu
195 200 205

His Glu Arg Phe Glu Ile Lys Asp Lys Met Leu Ile Glu Phe Tyr Glu
210 215 220

Ser Pro Asp Pro Glu Arg Arg Lys Arg Phe Pro Gly Lys Ser Val Asn
225 230 235 240

Ser Lys Leu Ser Ile Lys Lys Thr Leu Pro Ser Met Leu Ile Leu Ser
245 250 255

Gly Leu Thr Ala Gly Met Leu Met Thr Asp Ala Gly Arg Lys Leu Tyr
260 265 270

Val Asn Thr Trp Ile Tyr Gly Thr Leu Leu Gly Cys Leu Trp Val Thr
275 280 285

Ile Lys Ala
290
<210> 128
<211> 261
<212> PRT
<213> Homo sapiens
<220>
<221> SITE
<222> 68..73
<223> Box II
<221> SITE
<222> 119..127
<223> Box III
<400> 128

Met Arg Tyr Leu Leu Pro Ser Val Val Leu Leu Gly Thr Ala Pro Thr
1 5 10 15

Tyr Val Leu Ala Trp Gly Val Trp Arg Leu Leu Ser Ala Phe Leu Pro
20 25 30

Ala Arg Phe Tyr Gln Ala Leu Asp Asp Arg Leu Tyr Cys Val Tyr Gln
35 40 45

Ser Met Val Leu Phe Phe Phe Glu Asn Tyr Thr Gly Val Gln Met Tyr
50 55 60

Leu Val Ile Phe Pro Glu Gly Thr Arg Tyr Asn Pro Glu Gln Thr Lys
65 70 75 80

Val Leu Ser Ala Ser Gln Ala Phe Ala Ala Gln Arg Gly Leu Ala Val
85 90 95

Leu Lys His Val Leu Thr Pro Arg Ile Lys Ala Thr His Val Ala Phe
100 105 110

Asp Cys Met Lys Asn Tyr Leu Asp Ala Ile Tyr Asp Val Thr Val Val
115 120 125

Tyr Glu Gly Lys Asp Asp Gly Gly Gln Arg Arg Glu Ser Pro Thr Met
130 135 140

Thr Glu Phe Leu Cys Lys Glu Cys Pro Lys Ile His Ile His Ile Asp
145 150 155 160

Arg Ile Asp Lys Lys Asp Val Pro Glu Glu Gln Glu His Met Arg Arg
165 170 175

Trp Leu His Glu Arg Phe Glu Ile Lys Asp Lys Met Leu Ile Glu Phe
180 185 190

Tyr Glu Ser Pro Asp Pro Glu Arg Arg Lys Arg Phe Pro Gly Lys Ser
195 200 205

Val Asn Ser Lys Leu Ser Ile Lys Lys Thr Leu Pro Ser Met Leu Ile
210 215 220

Leu Ser Gly Leu Thr Ala Gly Met Leu Met Thr Asp Ala Gly Arg Lys
225 230 235 240

Leu Tyr Val Asn Thr Trp Ile Tyr Gly Thr Leu Leu Gly Cys Leu Trp
245 250 255

Val Thr Ile Lys Ala
260

<210> 129 
<211> 90 
<212> PRT 

<213> Homo sapiens 
<400> 129 

Met Arg Tyr Leu Leu Pro Ser Val Val Leu Leu Gly Thr Ala Pro Thr
1 5 10 15

Tyr Val Leu Ala Trp Gly Val Trp Arg Leu Leu Ser Ala Phe Leu Pro
20 25 30

Ala Arg Phe Tyr Gln Ala Leu Asp Asp Arg Leu Tyr Cys Val Tyr Gln
35 40 45

Ser Met Val Leu Phe Phe Phe Glu Asn Tyr Thr Gly Val Gln Asn Phe
50 55 60

Ser Ala Lys Asn Val Gln Lys Phe Ile Phe Thr Leu Ile Val Ser Thr
65 70 75 80

Lys Lys Met Ser Gln Lys Asn Lys Asn Ile
85 90
<210> 130
<211> 68
<212> PRT
<213> Homo sapiens
<400> 130

Met Arg Tyr Leu Leu Pro Ser Val Val Leu Leu Gly Thr Ala Pro Thr
1 5 10 15

Tyr Val Leu Ala Trp Gly Val Trp Arg Leu Leu Ser Ala Phe Leu Pro
20 25 30

Ala Arg Phe Tyr Gln Ala Leu Asp Asp Arg Leu Tyr Cys Val Tyr Gln
35 40 45

Ser Met Val Leu Phe Phe Phe Glu Asn Tyr Thr Gly Val Gln Asp Ala
50 55 60

Tyr Arg Ile Leu
65

<210> 131
<211> 66
<212> PRT
<213> Homo sapiens
<400> 131

Met Arg Tyr Leu Leu Pro Ser Val Val Leu Leu Gly Thr Ala Pro Thr
1 5 10 15

Tyr Val Leu Ala Trp Gly Val Trp Arg Leu Leu Ser Ala Phe Leu Pro
20 25 30

Ala Arg Phe Tyr Gln Ala Leu Asp Asp Arg Leu Tyr Cys Val Tyr Gln
35 40 45

Ser Met Val Leu Phe Phe Phe Glu Asn Tyr Thr Gly Val Gln Arg Leu
50 55 60

Asp Ser
65

<210> 132
<211> 97
<212> PRT
<213> Homo sapiens
<220>
<221> SITE
<222> 81..83
<223> Box I
<400> 132

Met Arg Tyr Leu Leu Pro Ser Val Val Leu Leu Gly Thr Ala Pro Thr
1 5 10 15

Tyr Val Leu Ala Trp Gly Val Trp Arg Leu Leu Ser Ala Phe Leu Pro
20 25 30

Ala Arg Phe Tyr Gln Ala Leu Asp Asp Arg Leu Tyr Cys Val Tyr Gln
35 40 45

Ser Met Val Leu Phe Phe Phe Glu Asn Tyr Thr Gly Val Gln Ile Leu
50 55 60

Leu Tyr Gly Asp Leu Pro Lys Asn Lys Glu Asn Ile Ile Tyr Leu Ala
65 70 75 80

Asn His Gln Ser Thr Asp Val Ser Cys Asp Phe Ser Arg Arg Tyr Lys
85 90 95

Val

<210> 133
<211> 182
<212> PRT
<213> Homo sapiens
<220>
<221> SITE
<222> 81..83
<223> Box I
<400> 133

Met Arg Tyr Leu Leu Pro Ser Val Val Leu Leu Gly Thr Ala Pro Thr
1 5 10 15

Tyr Val Leu Ala Trp Gly Val Trp Arg Leu Leu Ser Ala Phe Leu Pro
20 25 30

Ala Arg Phe Tyr Gln Ala Leu Asp Asp Arg Leu Tyr Cys Val Tyr Gln
35 40 45

Ser Met Val Leu Phe Phe Phe Glu Asn Tyr Thr Gly Val Gln Ile Leu
50 55 60

Leu Tyr Gly Asp Leu Pro Lys Asn Lys Glu Asn Ile Ile Tyr Leu Ala
65 70 75 80

Asn His Gln Ser Thr Val Asp Trp Ile Val Ala Asp Ile Leu Ala Ile
85 90 95

Arg Gln Asn Ala Leu Gly His Val Arg Tyr Val Leu Lys Glu Gly Leu
100 105 110

Lys Trp Leu Pro Leu Tyr Gly Cys Tyr Phe Ala Gln His Gly Gly Ile
115 120 125

Tyr Val Lys Arg Ser Ala Lys Phe Asn Glu Lys Glu Met Arg Asn Lys
130 135 140

Leu Gln Ser Tyr Val Asp Ala Gly Thr Pro Asn Phe Ser Ala Lys Asn
145 150 155 160

Val Gln Lys Phe Ile Phe Thr Leu Ile Val Ser Thr Lys Lys Met Ser
165 170 175

Gln Lys Asn Lys Asn Ile
180

<210> 134
<211> 315
<212> PRT
<213> Homo sapiens
<220>
<221> SITE
<222> 81..83
<223> Box I
<221> SITE
<222> 160..165
<223> Box II
<400> 134

Met Arg Tyr Leu Leu Pro Ser Val Val Leu Leu Gly Thr Ala Pro Thr
1 5 10 15

Tyr Val Leu Ala Trp Gly Val Trp Arg Leu Leu Ser Ala Phe Leu Pro
20 25 30

Ala Arg Phe Tyr Gln Ala Leu Asp Asp Arg Leu Tyr Cys Val Tyr Gln
35 40 45

Ser Met Val Leu Phe Phe Phe Glu Asn Tyr Thr Gly Val Gln Ile Leu
50 55 60

Leu Tyr Gly Asp Leu Pro Lys Asn Lys Glu Asn Ile Ile Tyr Leu Ala
65 70 75 80

Asn His Gln Ser Thr Val Asp Trp Ile Val Ala Asp Ile Leu Ala Ile
85 90 95

Arg Gln Asn Ala Leu Gly His Val Arg Tyr Val Leu Lys Glu Gly Leu
100 105 110

Lys Trp Leu Pro Leu Tyr Gly Cys Tyr Phe Ala Gln His Gly Gly Ile
115 120 125

Tyr Val Lys Arg Ser Ala Lys Phe Asn Glu Lys Glu Met Arg Asn Lys
130 135 140

Leu Gln Ser Tyr Val Asp Ala Gly Thr Pro Met Tyr Leu Val Ile Phe
145 150 155 160

Pro Glu Gly Thr Arg Tyr Asn Pro Glu Gln Thr Lys Val Leu Ser Ala
165 170 175

Ser Gln Ala Phe Ala Ala Gln Arg Gly Lys Asp Asp Gly Gly Gln Arg
180 185 190

Arg Glu Ser Pro Thr Met Thr Glu Phe Leu Cys Lys Glu Cys Pro Lys
195 200 205

Ile His Ile His Ile Asp Arg Ile Asp Lys Lys Asp Val Pro Glu Glu
210 215 220

Gln Glu His Met Arg Arg Trp Leu His Glu Arg Phe Glu Ile Lys Asp
225 230 235 240

Lys Met Leu Ile Glu Phe Tyr Glu Ser Pro Asp Pro Glu Arg Arg Lys
245 250 255

Arg Phe Pro Gly Lys Ser Val Asn Ser Lys Leu Ser Ile Lys Lys Thr
260 265 270

Leu Pro Ser Met Leu Ile Leu Ser Gly Leu Thr Ala Gly Met Leu Met
275 280 285

Thr Asp Ala Gly Arg Lys Leu Tyr Val Asn Thr Trp Ile Tyr Gly Thr
290 295 300

Leu Leu Gly Cys Leu Trp Val Thr Ile Lys Ala
305 310 315

<210> 135
<211> 300
<212> PRT
<213> Homo sapiens
<220>
<221> SITE
<222> 81..83
<223> Box I
<221> SITE
<222> 160..165
<223> Box II
<400> 135

Met Arg Tyr Leu Leu Pro Ser Val Val Leu Leu Gly Thr Ala Pro Thr
1 5 10 15

Tyr Val Leu Ala Trp Gly Val Trp Arg Leu Leu Ser Ala Phe Leu Pro
20 25 30

Ala Arg Phe Tyr Gln Ala Leu Asp Asp Arg Leu Tyr Cys Val Tyr Gln
35 40 45

Ser Met Val Leu Phe Phe Phe Glu Asn Tyr Thr Gly Val Gln Ile Leu
50 55 60

Leu Tyr Gly Asp Leu Pro Lys Asn Lys Glu Asn Ile Ile Tyr Leu Ala
65 70 75 80

Asn His Gln Ser Thr Val Asp Trp Ile Val Ala Asp Ile Leu Ala Ile
85 90 95

Arg Gln Asn Ala Leu Gly His Val Arg Tyr Val Leu Lys Glu Gly Leu
100 105 110

Lys Trp Leu Pro Leu Tyr Gly Cys Tyr Phe Ala Gln His Gly Gly Ile
115 120 125

Tyr Val Lys Arg Ser Ala Lys Phe Asn Glu Lys Glu Met Arg Asn Lys
130 135 140

Leu Gln Ser Tyr Val Asp Ala Gly Thr Pro Met Tyr Leu Val Ile Phe
145 150 155 160

Pro Glu Gly Thr Arg Tyr Asn Pro Glu Gln Thr Lys Val Leu Ser Ala
165 170 175

Ser Gln Ala Phe Ala Ala Gln Arg Glu Phe Leu Cys Lys Glu Cys Pro
180 185 190

Lys Ile His Ile His Ile Asp Arg Ile Asp Lys Lys Asp Val Pro Glu
195 200 205

Glu Gln Glu His Met Arg Arg Trp Leu His Glu Arg Phe Glu Ile Lys
210 215 220

Asp Lys Met Leu Ile Glu Phe Tyr Glu Ser Pro Asp Pro Glu Arg Arg
225 230 235 240

Lys Arg Phe Pro Gly Lys Ser Val Asn Ser Lys Leu Ser Ile Lys Lys
245 250 255

Thr Leu Pro Ser Met Leu Ile Leu Ser Gly Leu Thr Ala Gly Met Leu
260 265 270

Met Thr Asp Ala Gly Arg Lys Leu Tyr Val Asn Thr Trp Ile Tyr Gly
275 280 285

Thr Leu Leu Gly Cys Leu Trp Val Thr Ile Lys Ala
290 295 300

<210> 136
<211> 185
<212> PRT
<213> Homo sapiens
<220>
<221> SITE
<222> 81..83
<223> Box I
<221> SITE
<222> 160..165
<223> Box II
<400> 136

Met Arg Tyr Leu Leu Pro Ser Val Val Leu Leu Gly Thr Ala Pro Thr
1 5 10 15

Tyr Val Leu Ala Trp Gly Val Trp Arg Leu Leu Ser Ala Phe Leu Pro
20 25 30

Ala Arg Phe Tyr Gln Ala Leu Asp Asp Arg Leu Tyr Cys Val Tyr Gln
35 40 45

Ser Met Val Leu Phe Phe Phe Glu Asn Tyr Thr Gly Val Gln Ile Leu
50 55 60

Leu Tyr Gly Asp Leu Pro Lys Asn Lys Glu Asn Ile Ile Tyr Leu Ala
65 70 75 80

Asn His Gln Ser Thr Val Asp Trp Ile Val Ala Asp Ile Leu Ala Ile
85 90 95

Arg Gln Asn Ala Leu Gly His Val Arg Tyr Val Leu Lys Glu Gly Leu
100 105 110

Lys Trp Leu Pro Leu Tyr Gly Cys Tyr Phe Ala Gln His Gly Gly Ile
115 120 125

Tyr Val Lys Arg Ser Ala Lys Phe Asn Glu Lys Glu Met Arg Asn Lys
130 135 140

Leu Gln Ser Tyr Val Asp Ala Gly Thr Pro Met Tyr Leu Val Ile Phe
145 150 155 160

Pro Glu Gly Thr Arg Tyr Asn Pro Glu Gln Thr Lys Val Leu Ser Ala
165 170 175

Ser Gln Ala Phe Ala Ala Gln Arg Gly
180 185

<210> 137
<211> 19
<212> DNA
<213> Homo sapiens
<220>
<221> misc_binding
<222> 1..19
<223> amplification oligonucleotide PG1ASe13
<400> 137
accggggtcc agttgactg
<210> 138
<211> 17
<212> DNA
<213> Homo sapiens
<220>
<221> misc_binding
<222> 1..17
<223> amplification oligonucleotide PG1ASe14
<400> 138
cggggtccag catggag
<210> 139
<211> 16
<212> DNA
<213> Homo sapiens
<220>
<221> misc_binding
<222> 1..16
<223> amplification oligonucleotide PG1ASe15
<400> 139

ccggggtcca ggfcctt 

<210> 140
<211> 16
<212> DNA
<213> Homo sapiens
<220>
<221> misc_binding
<222> 1..16
<223> amplification oligonucleotide PG1ASe16
<400> 140
cggggtccag gccttg
<210> 141
<211> 21
<212> DNA
<213> Homo sapiens
<220>
<221> misc_binding
<222> 1..21
<223> amplification oligonucleotide PG1ASe17
<400> 141
accggggtcc agaatttctc t
<210> 142
<211> 19
<212> DNA
<213> Homo sapiens
<220>
<221> misc_binding
<222> 1..19
<223> amplification oligonucleotide PG1ASe18
<400> 142
cggggtccag gatgcttat
<210> 143
<211> 23
<212> DNA
<213> Homo sapiens
<220>
<221> misc_binding
<222> 1..23
<223> amplification oligonucleotide PG1ASe24
<400> 143
aatcatcaaa gcacagcatg gag
<210> 144
<211> 28
<212> DNA
<213> Homo sapiens
<220>
<221> misc_binding
<222> 1..28
<223> amplification oligonucleotide PG1ASe25
<400> 144
caaatcatca aagcacagat gtatcttg
<210> 145
<211> 20
<212> DNA
<213> Homo sapiens
<220>
<221> misc_binding
<222> 1..20
<223> amplification oligonucleotide PG1ASe26
<400> 145
atcaaagcac aggccttgca
<210> 146
<211> 26
<212> DNA
<213> Homo sapiens
<220>
<221> misc_binding
<222> 1..26
<223> amplification oligonucleotide PG1ASe27
<400> 146
agcaaatcat caaagcacag aatttc
<210> 147
<211> 28
<212> DNA
<213> Homo sapiens
<220>
<221> misc_binding
<222> 1..28
<223> amplification oligonucleotide PG1ASe28
<400> 147
atcatcaaag cacaggatgc ttatagaa
<210> 148
<211> 31
<212> DNA
<213> Homo sapiens
<220>
<221> misc_binding
<222> 1..31
<223> amplification oligonucleotide PG1ASe35
<400> 148
gtgttacttt gctcagatgt atcttgtgat t
<210> 149
<211> 23
<212> DNA
<213> Homo sapiens
<220>
<221> misc_binding
<222> 1..23
<223> amplification oligonucleotide PG1ASe36
<400> 149
tactttgctc aggccttgca gta
<210> 150
<211> 27
<212> DNA
<213> Homo sapiens
<220>
<221> misc_binding
<222> 1..27
<223> amplification oligonucleotide PG1ASe37
<400> 150
gggtgttact ttgctcagaa tttctct
<210> 151
<211> 29
<212> DNA
<213> Homo sapiens
<220>
<221> misc_binding
<222> 1..29
<223> amplification oligonucleotide PG1ASe38
<400> 151
ggtgttactt tgctcaggat gcttataga
<210> 152
<211> 20
<212> DNA
<213> Homo sapiens
<220>
<221> misc_binding
<222> 1..20
<223> amplification oligonucleotide PG1ASe46
<400> 152
caggaactcc agccttgcag

<210> 153
<211> 23
<212> DNA
<213> Homo sapiens
<220>
<221> misc_binding
<222> 1..23
<223> amplification oligonucleotide PG1ASe47
<400> 153
caggaactcc aaatttctct gca
<210> 154
<211> 25
<212> DNA
<213> Homo sapiens
<220>
<221> misc_binding
<222> 1..25
<223> amplification oligonucleotide PG1ASe48
<400> 154
cgcaggaact ccagatgctt ataga
<210> 155
<211> 22
<212> DNA
<213> Homo sapiens
<220>
<221> misc_binding
<222> 1..22
<223> amplification oligonucleotide PG1ASe57
<400> 155
ctgcccaacg tgaatttctc tg
<210> 156
<211> 22
<212> DNA
<213> Homo sapiens
<220>
<221> misc_binding
<222> 1..22
<223> amplification oligonucleotide PG1ASe58
<400> 156
gcccaacgtg gatgcttata ga
<210> 157
<211> 23
<212> DNA
<213> Homo sapiens
<220>
<221> misc_binding
<222> 1..23
<223> amplification oligonucleotide PG1ASe68
<400> 157
cgaccatgac gggatgctta tag
<210> 158
<211> 19
<212> DNA
<213> Homo sapiens
<220>
<221> misc_binding
<222> 1..19
<223> amplification oligonucleotide PG1ASe1X
<400> 158
ccggggtcca gagattgga
<210> 159
<211> 26
<212> DNA
<213> Homo sapiens
<220>
<221> misc_binding
<222> 1..26
<223> amplification oligonucleotide PG1ASeX2
<400> 159
aaagtggaag gccctcttta acaata
<210> 160
<211> 25
<212> DNA
<213> Homo sapiens
<220>
<221> misc_binding
<222> 1..25
<223> amplification oligonucleotide PG1Ae1b3
<400> 160
gccctcttta acattgactg gattg
<210> 161
<211> 24
<212> DNA
<213> Homo sapiens
<220>
<221> misc_binding
<222> 1..24
<223> amplification oligonucleotide PG1Ae1b4
<400> 161
gccctcttta acacatggag gaat
<210> 162
<211> 28
<212> DNA
<213> Homo sapiens
<220>
<221> misc_binding
<222> 1..28
<223> amplification oligonucleotide PG1Ae1b5
<400> 162
ggccctcttt aacaatgtat cttgtgat
<210> 163
<211> 25
<212> DNA
<213> Homo sapiens
<220>
<221> misc_binding
<222> 1..25
<223> amplification oligonucleotide PG1Ae1b6
<400> 163
gccctcttta acagccttgc agtat
<210> 164
<211> 25
<212> DNA
<213> Homo sapiens
<220>
<221> misc_binding
<222> 1..25
<223> amplification oligonucleotide PG1Ae1b7
<400> 164
ggccctcttt aacaaatttc tctgc
<210> 165
<211> 28
<212> DNA
<213> Homo sapiens
<220>
<221> misc_binding
<222> 1..28
<223> amplification oligonucleotide PG1Ae1b8
<400> 165
gaaggccctc tttaacagat gcttatag
<210> 166
<211> 26
<212> DNA
<213> Homo sapiens
<220>
<221> misc_binding
<222> 1..26
<223> amplification oligonucleotide PG1Ae3b4
<400> 166
atgctggatt atagcatgga ggaatc

<210> 167
<211> 31
<212> DNA
<213> Homo sapiens
<220>
<221> misc_binding
<222> 1..31
<223> amplification oligonucleotide PG1Ae3b5
<400> 167
caaaatgctg gattatagat gtatcttgtg a
<210> 168
<211> 23
<212> DNA
<213> Homo sapiens
<220>
<221> misc_binding
<222> 1..23
<223> amplification oligonucleotide PG1Ae3b6
<400> 168
tgctggatta taggccttgc agt
<210> 169
<211> 28
<212> DNA
<213> Homo sapiens
<220>
<221> misc_binding
<222> 1..28
<223> amplification oligonucleotide PG1Ae3b7
<400> 169
tgctggatta tagaatttct ctgcaaag
<210> 170
<211> 30
<212> DNA
<213> Homo sapiens
<220>
<221> misc_binding
<222> 1..30
<223> amplification oligonucleotide PG1Ae3b8
<400> 170
ccaaaatgct ggattatagg atgcttatag

<210> 171
<211> 21
<212> DNA
<213> Homo sapiens
<220>
<221> misc_binding
<222> 1..21
<223> amplification oligonucleotide PG1Ae5b6
<400> 171
tatctttgca tggcagcctt g
<210> 172
<211> 23
<212> DNA
<213> Homo sapiens
<220>
<221> misc_binding
<222> 1..23
<223> amplification oligonucleotide PG1Ae5b7
<400> 172
ctttgcatgg caaatttctc tgc
<210> 173
<211> 27
<212> DNA
<213> Homo sapiens
<220>
<221> misc_binding
<222> 1..27
<223> amplification oligonucleotide PG1Ae5b8
<400> 173
ttatctttgc atggcagatg cttatag
<210> 174
<211> 20
<212> DNA
<213> Homo sapiens
<220>
<221> misc_binding
<222> 1..20
<223> amplification oligonucleotide PG1Ae56b
<400> 174
ctgcccaacg tgggaaagac

<210> 175
<211> 21
<212> DNA
<213> Homo sapiens
<220>
<221> misc_binding
<222> 1..21
<223> amplification oligonucleotide PG1Ae46b
<400> 175
gcaggaactc caggaaagac g
<210> 176
<211> 25
<212> DNA
<213> Homo sapiens
<220>
<221> misc_binding
<222> 1..25
<223> amplification oligonucleotide PG1Ae36b
<400> 176
tgttactttg ctcagggaaa gacga
<210> 177
<211> 22
<212> DNA
<213> Homo sapiens
<220>
<221> misc_binding
<222> 1..22
<223> amplification oligonucleotide PG1Ae26b
<400> 177
atcaaagcac agggaaagac ga
<210> 178
<211> 19
<212> DNA
<213> Homo sapiens
<220>
<221> misc_binding
<222> 1..19
<223> amplification oligonucleotide PG1Ae16b
<400> 178
ccggggtcca gggaaagac
<210> 179
<211> 56520
<212> DNA
<213> Homo sapiens
<220>
<221> exon
<222> 2001..2216
<223> exon1
<221> exon
<222> 18196..18265
<223> exon2
<221> exon
<222> 23716..23831
<223> exon3
<221> exon
<222> 25570..25659
<223> exon4
<221> exon
<222> 34668..34758
<223> exon5
<221> exon
<222> 40685..40843
<223> exon6
<221> exon
<222> 48067..48190
<223> exon7
<221> exon
<222> 50179..54519
<223> exon8
<221> polyA_signal
<222> 54493..54498
<223> AATAAA

<221> primer_bind
<222> 1991..2008
<223> upstream amplification primer 5-63
<221> primer_bind
<222> 2505..2525
<223> downstream amplification primer 5-63, complement
<221> primer_bind
<222> 4091..4111
<223> downstream amplification primer 99-622
<221> primer_bind
<222> 4528..4546
<223> upstream amplification primer 99-622, complement
<221> primer_bind
<222> 5475..5495
<223> downstream amplification primer 99-621
<221> primer_bind
<222> 5927..5947
<223> upstream amplification primer 99-621, complement
<221> primer_bind
<222> 8127..8144
<223> downstream amplification primer 99-619
<221> primer_bind
<222> 8560..8578
<223> upstream amplification primer 99-619, complement
<221> primer_bind
<222> 11622..11639
<223> upstream amplification primer 4-76
<221> primer_bind
<222> 12018..12037
<223> downstream amplification primer 4-76, complement
<221> primer_bind
<222> 11930..11947
<223> upstream amplification primer 4-77
<221> primer_bind
<222> 12339..12358
<223> downstream amplification primer 4-77, complement
<221> primer_bind
<222> 12915..12932
<223> upstream amplification primer 4-71
<221> primer_bind
<222> 13317..13334
<223> downstream amplification primer 4-71, complement
<221> primer_bind
<222> 13216..13233
<223> upstream amplification primer 4-72
<221> primer_bind
<222> 13617..13636
<223> downstream amplification primer 4-72, complement
<221> primer_bind
<222> 13547..13564
<223> upstream amplification primer 4-73
<221> primer_bind
<222> 13962..13981
<223> downstream amplification primer 4-73, complement
<221> primer_bind
<222> 15994..16011
<223> downstream amplification primer 99-610
<221> primer_bind
<222> 16463..16480
<223> upstream amplification primer 99-610, complement
<221> primer_bind
<222> 17304..17324
<223> downstream amplification primer 99-609
<221> primer_bind
<222> 17814..17832
<223> upstream amplification primer 99-609, complement
<221> primer_bind
<222> 18008..18027
<223> upstream amplification primer 4-90
<221> primer_bind
<222> 18423..18442
<223> downstream amplification primer 4-90, complement
<221> primer_bind
<222> 18699..18716
<223> downstream amplification primer 99-607
<221> primer_bind
<222> 19164..19182
<223> upstream amplification primer 99-607, complement
<221> primer_bind
<222> 22589..22609
<223> downstream amplification primer 99-602
<221> primer_bind
<222> 23111..23129
<223> upstream amplification primer 99-602, complement
<221> primer_bind
<222> 25098..25118
<223> downstream amplification primer 99-600
<221> primer_bind
<222> 25657..25674
<223> upstream amplification primer 99-600, complement
<221> primer_bind
<222> 26537..26557
<223> downstream amplification primer 99-598
<221> primer_bind
<222> 27022..27040
<223> upstream amplification primer 99-598, complement
<221> primer_bind
<222> 32262..32281
<223> downstream amplification primer 99-592
<221> primer_bind
<222> 32823..32841
<223> upstream amplification primer 99-592, complement
<221> primer_bind
<222> 34215..34233
<223> upstream amplification primer 99-217
<221> prime r_bind 
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<222> 34624. .34644 

<223> downstream amplification primer 99-217 , complement 
<221> primer_bind 
<222> 34473.. 34491 

<223> upstream amplification primer 5-47 
<221> primer_bind 
<222> 34916. .34936 

<223> downstream amplification primer 5-47 , complement 
<221> primer_jDind 
<222> 34702. .34722 

<223> downstream amplification primer 99-589 
<221> primer_iDind 
<222> 35182. .35200 

<223> upstream amplification primer 99-589 , complement 
<221> primer^bind . 
<222> 39591. .39611 

<223> upstream amplification primer 99-12899 
<221> primer_bind 
<222>. 39.971. .39991 

<223> downstream amplification primer 99-12899 , complement 
<221> primer_bind 
<222> 40531..40549 

<223> upstream amplification primer 4-12 
<221> primer_bind 
<222> 40932. .40950 

<223> downstream amplification primer 4-12 , complement 
<221> primer_bind 
<222> 40629. .40649 

<223> downstream amplification primer 99-582 
<221> primer_bind 
<222> 41058.. 41078 

<223> upstream amplification primer 99-582 , complement 
<221> primer_bind 
<222> 45729..45746 

<223> downstream amplification primer 99-576 
<221> primer_bind 
<222> 46186..46203 

<223> upstream amplification primer 99-576 , complement 
<221> primer_bind 
<222> 47879. .47896 

<223> upstream amplification primer 4-13 
<221> primer_bind 
<222> 48217. .48236 

<223> downstream amplification primer 4-13 , complement 
<221> primer_bind 
<222> 48902. .48922 

<223> upstream amplification primer 99-12903 
<221> primer_bind 
<222> 49331.. 49351 

<223> downstream amplification primer 99-12903 , complement 
<221> primer_bind 
<222> 49830. .49848 

<223> upstream amplification primer 5-56 
<221> primer_bind 
<222> 50271. .50290 

<223> downstream amplification primer 5-56 , complement 
<221> primer_bind 
<222> 50172. .50189 

<223> upstream amplification primer 4-61 
<221> primer_bind 
<222> 50573..50591 






<223> downstream amplification primer 4-61 , complement 
<221> primer_bind 
<222> 50541. .50560 

<223> upstream amplification primer 4-62 
<221> primer_bind 
<222> 50940. .50959 

<223> downstream amplification primer 4-62 , complement 
<221> primer_bind 
<222> 50555..50572 

<223> upstream amplification primer 4-63 
<221> primer_bind 
<222> 50964. .50983 

<223> downstream amplification primer 4-63 , complement 
<221> primer_bind 
<222> 50774. .50792 

<223> upstream amplification primer 4-64 
<221> primer_bind 
<222> 51183. .51202 

<223> downstream amplification primer 4-64 , complement 
<221> primer_bind 
<222> 51146. .51165 

<223> upstream amplification primer 4-65 
<221> primer_bind 
<222> 51479..51496 

<223> downstream amplification primer 4-65 , complement 
<221> primer_bind 
<222> 51593. .51610 

<223> upstream amplification primer 4-67 
<221> primer_bind 
<222> 29734. .29744 

<223> upstream amplification primer 4-67 , complement 
<221> primer_bind 
<222> 51167. .51185 

<223> upstream amplification primer 5-50 
<221> primer_bind 
<222> 51667. .51687 

<223> downstream amplification primer 5-50 , complement 
<221> primer_bind 
<222> 51387. .51403 

<223> upstream amplification primer 5-71 
<221> primer_bind 
<222> 51826. .51843 

<223> downstream amplification primer 5-71 , complement 
<221> primer_bind 
<222> 51772. .51789 

<223> upstream amplification primer 5-30 
<221> primer_bind 
<222> 52199..52217 

<223> downstream amplification primer 5-30 , complement 
<221> primer_bind 
<222> 51850. .51867 

<223> upstream amplification primer 5-58 
<221> primer_bind 
<222> 52382. .52400 

<223> downstream amplification primer 5-58 , complement 
<221> primer_bind 
<222> 52507. .52527 

<223> upstream amplification primer 5-53 
<221> primer_bind 
<222> 52997..53017 

<223> downstream amplification primer 5-53 , complement 
<221> primer_bind 
<222> 52703.. 52721 

<223> upstream amplification primer 5-60 
<221> primer_bind 
<222> 53142 . .53162 

<223> downstream amplification primer 5-60 , complement 
<221> primer_bind 
<222> 53001..53018 

<223> upstream amplification primer 5-68 
<221> primer_bind 
<222> 53521..53538 

<223> downstream amplification primer 5-68 , complement 
<221> primer_bind 
<222> 53459..53476 

<223> upstream amplification primer 5-66 
<221> primer_bind 
<222> 53920..53940 

<223> downstream amplification primer 5-66 , complement 
<221> primer_bind 
<222> 54202..54220 

<223> upstream amplification primer 5-62 
<221> primer_bind 
<222> 54681..54701 

<223> downstream amplification primer 5-62 , complement 

<400> 179 

gtggatctgt gactgttcgc aggaagagag gagcgggagc aggacagaca ataactgata 60 

gtcaggagct gggtttggag ataaagaggg aacaagagaa agttaagttc tgtgttttca 120 

tggcaaacat tgcacaaaag tttacaactt cgtgactaac agtaatctgg ggtgattcac 180 

aacaaattta cacataaaca catatttact gactttatac acagcaatcc taacgtgaac 240 

acagaacct^ ctttatcttt tcgcacactg ttctagtgta gagatgtctg gtctcagtta 300 

aagaaagcat aaggagcatt agttgtgcac actgtccaca cccgtgactt ttttccacca 360 

gtactaaacc tagtgcttct tacagtacag ggcaatgaca gccacagaaa gagagaagct 420 

ccttttactg tgtaatgctt cctgctggcc ttcaaatact tgttacttga gagatctcca 480 

ttcacctggc tttgtcccca aaggtcatca tctaccaatg atgttgttat ttgatgttaa 540 

tcatgtataa agaaagtagc taccatcctg gccctgatta gaacttccca ctgaaatacc 600 

gtcctgccta aaggtagcac aggtttccat tatggtggtg gtggggaggg ggcgggaata 660 

tatatatata tatatatata tatatatatg gtaaagcatt cggcattctt ttaaagtaca 720 

actatccttg aaaagggtta catattaaac catttttacc acagccaaag gggaggagaa 780 

agatccaaaa gtcctgtgga tctgctttaa catcaataaa acagttatcc acccttcgta 840 

gcttttagtg aaggctacaa aagtatgctt tttatggatt acacatgtgc acgcaactac 900 

tttaattact acagaaaaaa acgaggctcc ttattaaaaa aaaatcagaa acaagtccaa 960 

cagactctga ggaaatgaag caagagtgaa ttctgaaaag gtctaataaa cagtatggaa 1020 

atatccttgt gggattgttc ttcagctatg cataaacatg taattatcat cattactgtg 1080 

atggggaaaa acacggaccc taattctgaa acaccctggt agcgagagac gggcaggagg 1140 

ggctgctgcg cactcagagc ggaggctgag gaggcggcgt ccccttgcaa aggactggca 1200 

gtgagcagat ggagacactc gagctgcccc gcgacctggg ccgagctgcc tacaacctgg 1260 

gcccaggtgc ctgcaagaat tagacctccg ataacgttaa cacccacttt ctcactgctc 1320 

taattgtgtg catcccggcg cccaggggct tgtgagcagc aggtgcgcgt tccaggcagc 1380 

tccagcgacc cttaaacctg accgcgcgca cgtccggccc gagggagcag aacaagaggc 1440 

acccggaccc tcctccggcc agcacccacc ttcacccagt tccgtcagtc gccaccacct 1500 

cccttcccgc gtccgcagcc ggcccagctg gggagcatgc gcagtggccg gagccgggtt 1560 

gcccgcgcca cagcaggtag ctgtactgca actgtcggcc caaaccaacc aatcaagaga 1620 

cgtgttattg ccgccgaggt ggaactatgg caacgggcga ccaatcagaa ggcgcgttgt 1680 

tgccgcggag ccccctgccc cggcaggggg atgtggcgat gggtgagggt catggggtgt 1740 

gagcatccct gagccatcga tccgggaggg ccgcgggttc ccttgctttg ccgccgggag 1800 

cggcgcacgc agccccgcac tcgcctaccc ggccccgggc ggcggcgcgg cccatgcggc 1860 

tgggggcgga ggctgggagc gggtggcggg cgcggcggcc cgggcccggg cggtgattgg 1920 

ccgcctgctg gccgcgactg aggcccggga ggcgggcggg gagcgcaggc ggagctcgct 1980 

gccgccgagc tgagaagatg ctgctgtccc tggtgctcca cacgtactcc atgcgctacc 2040 

tgctgcccag cgtcgtgctc ctgggcacgg cgcccaccta cgtgttggcc tggggggtct 2100 

ggcggctgct ctccgccttc ctgcccgccc gcttctacca agcgctggac gaccggctgt 2160 
actgcgtcta ccagagcatg gtgctcttct tcttcgagaa ttacaccggg gtccaggtga 2220 

gccgcctccc gctcccgggt ctcggcgtcc acccgagctc ccgggggcgc ggacctctcc 2280 

gctcccccac agctggcgag ggtcacccgg ccggcccggc ggacccagca cggagagcac 2340 

gtgccgcctc cccgccttcc tctccgcatg cttcctgccg ttctgccgag atcgctctct 2400 

aggaagctgt ggctgcgtcg tcctgaggct acgagtggga cccgccgccc ctttccccgc 2460 

ccctcgcctg ggtctgatgc tgcttagcaa agtgggtgca gatgcacgtt ttaaataata 2520 

gggcacgcgt ttagcagttt ctggcctttg gtccaaagag gtggtcatgt tggaacagat 2580 

cggagacgtc tacactccga agtgcgcttt tacagtgacc tcttgaaaca gaagtacaat 2640 

tcggtcttgt gttctttccc ctggacaagt gaaagctggg cgaagaaatg aatacatttg 2700 

ttaaccgtag aagcctaact agatacaatt cttgccaact ttaactgggc ttgaatgtgt 2760 

gggtgatctg ttgtctgatt actttctttc tgttactgtt tctctgtaga gattggattc 2820 

gtagattaaa cttgagaaac aaaccataaa agtggaaggc cctctttaac agtaggtatt 2880 

tgaagtgtta taaaaaaaaa aaaggtgaat ttttctttta tttctcagtt tgaaagaaca 2940 

gctttattct tggttattcc taatgtccac ctagtcctct tttacttttc ttggtagggt 3000 

tagggtggca tggggaaatg ggacggtatc attttgtctt tttaactttt tttttttcca 3060 

cctacagcag ctgtttttac cctgtggtca gtcaggtact atatttagtt tgcagttgca 3120 

ctgctgatcg acccttgatg gccccagttg gaagttgttt ggggggaagg aactaggaga 3180 

ggccagggcc tcqatttaaa ccagtgtctg taagtgtctc cttggaagga aaaaaagata 3240 

ctgttccagg tcatggtttc ctggtagttg acgtttaaaa tgggcctcat ttaaaaattt 3300 

caataattca ggctaatttt ttccctttat atggtaactc caccaagttt gtctaaatgt 3360 

atgattttta tcatgattaa gtttttactt ccacatcatg tgacaactgg cctgggatgg 3420 

gatataagct cagaacacaa ^gtqattcac ctgttaaaaa aataattcta tctgtggcgg 3480 

gttatgttat ttttgttcaa agaggacaca atatgatgca gaatacacca ttgaaggatt 3540 

ttttggtttg gcaagttctt atttttttaa atggctgtaa aacctagcag tgtttctgaa 3600 

attgcatacc ttacctgatg ttcagagatc cgatttactt cttgatttcc cagcaagtga 3660 

ttttgaaaac atttaatcta atcattcccc ccaccgtctg ttcaaatcaa aggaagtggc 3720 

atccagcact aattttcatg catttatgaa aggatgcctg aggaccctta agtataattc 3780 

aaaattttgt ttaatgtgtg ttccttgatg aagttcttta ggagtcgtag aacgaactga 3840 

ttgcccactg atcatcaaat gcaagttatg aacatttaat aaaaatttaa aaccaagagt 3900 

ttcttgttcc tgcattttta tttttattgt atggagggga caaataatta ttttctgttt 3960 

agtaacagag cagggtattt tgaatttatt agggtctttt tctgcagtct gggtttcctg 4020 

tgtacacaaa gctacctttc aatatttttt attgtttctg ttaagattaa atcaatagag 4080 

gaataaatag ctatcttcaa acataagacc caaaggaaaa agatttatag tgatgttctg 4140 

tcaccttatt ttttacctgt gactttgtac cattaacttt gtcactgaga tgttttgatt 4200 

aaaattttta gcttgctttt cttgttttgt taggacactc tttttttctt gaattgtttt 4260 

tatcagcttt cgtttgcaag gctagtgatg attctcttgt tctgtataaa gtattgttga 4320 

ctcatttctg aagggagttt tagtaattta agaggttata agtttttaaa taaaaggttt 4380 

attaatttat atatattaaa gaggcatttt aaaataaaat tttttttaaa tgacattttt 4440 

acacctttca actctaggtt taaaaaataa gtggttcaca gtagttcttg cagaagaata 4500 

ttttctttta catagaattt ttaagctgaa gagaagtagt agtaggtcca tgagatttat 4560 

gatctgtgct tggcaggtaa acctgcttcc aacaaattta gttggatttt tcttggattc 4620 

tgggtaaata cctttttctt ccccagtttc actactttat tttcatatgt atctctgaga 4680 

tagagaaata tttcagtcag tgctgctaaa attgttcctt ataactcgtt tatcctttta 4740 

ggtccttcca gaatctctca ttggtactga aactcaaatg ggtactttct tcaccattta 4800 

tttctttaga ataagtaata agaattttat aagctttttt atatttcacg taatttgaga 4860 

ctattgaaaa tccagttaag tctctctact gtgttgagag gcattgattc aagtacctgt 4920 

gttactttcc tgtgctgcca aaacagatca cctcaaacta agcggcttaa aataatagaa 4980 

cttaagttct cgtgattctg gaggccagca ctttgaaatc aaggtgtagg ctcaatttta 5040 

ctccctctgg aggccctagg gggaatctgt tcttgtgggt ttcaacttct ggtgactggt 5100 

ggcattcctt ggcttggggc cccatcactt caacctctgc cttacagtcc ttgctgccac 5160 

ctcttctgtc tcacatctca ctctcccttt ctcttagaag gatgcttgtc attgggttta 5220 

gagcccacct ggatattccg ggatgatctc ttcatctcaa gatccttaat tataactgca 5280 

aagagccttt ttccaaataa gaaaacattc acaggttcca gggcttagga tgtggacaca 5340 

ttttttgagg ggctgccctt cattccccca caacaatgaa ctccatagtt ctgcctattc 5400 

agtattttgt agttatttcg tagtttaact tgccttattt ctttaggtat ttacgtatta 5460 

aagcattttg gtctctgctt tctttaacag agaacctggt tttctgtaat aagtttactt 5520 

actttcccat aatcttttag tttcttattt acagatttac cttcacatat cccttaagta 5580 

gaacatttga ttaactgttt tattttcgga acaaatctgc attctgtata ataaccaact 5640 

tattcatatt tcggtattct tttaattctt atctgattct gaaattacca tcttgtgatt 5700 

atatatatat atatatggaa ataactgaaa tcttgataaa ttaaaggtga tataacttct 5760 

aagacaatta attatgtatg atgtggtgaa tatactggtg tttggtttgt ttgccactta 5820 
aaagccctat ctataggata ggaagtaact tgaatgtgga atgcttagag actcagagta 5880 

agaggccgta tatatatcct tgagctggag tttaaggaaa acttatggga aattaaaagg 5940 

aaagttggag tactgacaga ggattgcgta ggactcatga aaaaggaatg aagttacctt 6000 

aaattctatc atcgtgagtt aacgtgaaac tagatttatg ttagtttata gcctagaatt 6060 

ctatcctagg aatctagata tatcctaaat gttgagatag ctgcataaac aataactgta 6120 

atcgttatga taaataatga caaatctttt tagcatgttt tgtgaagctg ataaatgtta 6180 

ataggatgtc ttcaaatgtc agaattcttt tttctttgct tcttttttaa aaaatttctt 6240 

ttcccccatt cctatgcaat acactgaaaa ctgatcattg aaatttgtag gccaaaaaat 6300 

taatcaacac gtaatagatt ggggtttggg tttttttgag tcagggtctt cttctgtcac 6360 

ccaggctctg gtpcggtggc accatcatgg ctcattgcag ccttgaatgc ctgggttcaa 6420 

gtgatcctcc gg^gtagctg ccgtgccatt atttctagct aatttttaaa agtttttgta 6480 

gaaatggggt ctktctgtgt tgcccaggct ggtcttgaat tcctggcctc aggtgatcct 6540 

tctgccttgg cctcccaaag tgctgggatt acaggtgtga gccaccatgc ctagccccta 6600 

ataaatattc ta^ttaccga tttatcttgc ttaaatcagt tggtaacact tggaatttac 6660 

ttcagaatat attttacatt agtggctctg actgctaatt cccccttctc caaatgctaa 6720 

tgtaatataa caataaaatg cacagttctt aagtttatat aaaataaaca ggttfetcagt 6780 

tgacctgctt taagtgtaaa atagtgtgaa aaacacaaga aagaagataa agaatttaag 6840 

attttgacat ttctctaata tgcccttaac ttctccaagg attcatactt ttttttgtaa 6900 

gacagaatct cacactgttg cccaaaccag aggtgcagtg gtgcagtctc cactcactgc 6960 

aacctctgcc cccgggctca agcggtcctc ccacctcagc ctcctgagta gctgggacta 7020 

caggtacaca gcaccatgcc cagctaattt ttttttttgg tattttttag tgggggtaga 7080 

gacgagattt tgccatattg cccagtctgg ttttgagctc ctgggctcaa gtgatccgtc 7140 

cttgatccac catgcttagc tgattcatac tcttaactga aacattgttc caagtttctc 7200 

agaaacagtc aaggcttttt atctagagaa catttataac tggatctttc tttgtgtagc 7260 

actgattcat caaactaatc ctaaactcct aatgagttaa atttatattc tgaatcttgc 7320 

tgtaaaagca gccattcatt agaatgaaac atgtttactt agaattggag aagggagctt 7380 

ataagtcatc tagtctactc ccttttatga cacttctaca ttctttctgc acttctgcca 7440 

aaatgttgcc cagcgtcgtc tctgatacct atagtcctaa caagaatatg aatcatacct 7500 

tgtatcctta attttactct tctctgctta tttgccattc atgtgaagac cttaaataga 7560 

tcttaaattg cttccttcac tttagctgag agtgacagga ctgtgtaggt gtgggtgtgt 7620 

ttctgcattt gcttatttaa gcaggataat aaaaactttt actataggaa attaaacatt 7680 

tcccaatcaa atacaattcc agtctaacac aattaaattc tggttaggga actgcttaac 7740 

ttactagact tataggaaaa tactaaaaaa atgtaactag aactctattt ttacacttta 7800 

taaatataaa cctctgtgaa caaaccagtt atttcaggtt gcatttgtgt atagtttttt 7860 

aatgcqtgat ttttctattt taaaatcaca gatgcaatta tacattcaaa cactgccaca 7920 

atactttgag aaagttaaag tttcccctac tcctacactg cgtacacctt tcctaggtac 7980 

atcccagttt ggtgtgtaac tttagatttc ttccaagagc ttttgagtaa gtgtttgaat 8040 

tgtgggaagg ttbtttagtt aaatgaactt cttacagatc agttttttag tacagtagca 8100 

cgaaatatac ctgcatacct atggggatac ctctgtgcca ttacgatgga aggcacggga 8160 

aaacagcact ccgtatatac ctagtttact ttccctcttt tgtatatttg tctgattttg 8220 

tggagctgat gcttctcaag tggaatcaga agttaacttt tcctttacta ttttctcatt 8280 

ttattatggt ttcttaacta gaggttgatg ttagtggttg gaccattcaa tagtaagtaa 8340 

tgacttttca gtaagggatc tctagaaccc agatccctta attcctgcaa tattcccgtg 8400 

tgtacattgt tccaggtgct gtcctgggta ccaagggata caatgtttga tagacaatgt 8460 

acctgccatt atggaggtca cattctagtg tgggaagaca aacaataaca agaaaatgaa 8520 

aatttactgt gccatgccag gttgtttagc ctggtgggtg agaggtaggg gtttggaaaa 8580 

tcttactgag caagtgacat ttgtgtggag ctctgtaaaa gggccagctt ggaaggtaat 8640 

gtagtcatcc aggtgagaaa tgatggttag gggagtggaa agagtggatg ttaagattga 8700 

aaagaattcc aaatctattt tagtggtagc tgatagggct ttgtgattga atgtggagga 8760 

aaaagaagag ggtgggttag taacacactc agtcgcagtt agtgagtgct gctgtgtgca 8820 

agtattgttc tattatgtaa ataattccat ctttacaaag taggcaccat tcttcctctt 8880 

ttacagacaa ggaaaaggga acacccatgg ttcacatctg tagtagccta gccaggagtt 8940 

tcaggcactt attttctgaa gatgctctgc ctggcaatgt ggttatattg gttgaaatga 9000 

gaccccctac tttcaaggta ttcatctagg aaagacatga actgccaatt acaatatagg 9060 

ataacactga aattagagac gtgtttatta actttgccat acagaggtaa agtaactctt 9120 

taaagtaact ctttgcttgg gttagtggag aaggctataa aaattacttg gagtttttac 9180 

tttgaacatg cgtaattaac atggaatgtt tagggaaaag aggttttcaa ttgataacat 9240 

aataaacatg aggagtttga agcatggcat tcaaggtttt ctaaattctg ccccggttaa 9300 

cttttccatt cgttggtttc attctagtct agcttttcct tctgggccgc ccctccccac 9360 

attagaccgc tdctctctgg aattccaact caagcccttg cttttctcca tctgtcatga 9420 

tgttacccca tqtcattgtc agggtaactt ttatgtaata ttaacatata taatactgat 9480 
ataacattag catattttaa tgtatggatc atctcctctg caacattgta acctcttgga 9540 

gatggcaata at^ggaagaa tgacttgatt ttactttttc ttttaacaaa aatggtggag 9600 

tagtctgggc acggtgtggc tcatgcctgt aatcccagca ttttgggagg ccaaggaggg 9660 

tggatcactt gaggtcaggc attcgagacc agtctggcca acattgtgaa accccatctc 9720 

taccaaaaaa atacaaacac ttactgggca tggtggtgtg tgcctgtagt cctagctact 9780 

caggaggctg aggtgggaga atcacttgaa catgggaggt agaggctcca gcttgggcga 9840 

cagagtgaga ccctgtctca aaagaaaaaa aaggtaaaag ggccaggtgc ggaggctcac 9900 

gctggtaatc caagcacttt gggaggctga ggcaatggat cacctgaggt cgggagttcg 9960 

agatcagcct gaccaacatg gagaaacccc ttctctacta aaaatacaaa attagccggg 10020 

cgtggtggtg cctgcctgta atctaagcta catgggaggc tgaggcagga gaatcacttg 10080 

aacccaggag acagaggttg tggtgagcca agatggcacc attgcactcc cgactgggca 10140 

acaagagcga aattccgtct caaaacaaac aaacaaacaa aacaaaacag agagaaaagg 10200 

cagagtactc tagggaattc tagtctgtgt ttctgtggaa atgtatatga atctcacttt 10260 

taagggatgg agatttttga atggcataac tagttgataa gttttgctct aacagggtac 10320 

ccaagtctag tgagtccgat tcattctttc cttaaataga tgaaggagga agaaacatga 10380 

ctccaccctc aagagtaagg cagaatgagc aaagtcagag aagttaaaaa agaattctca 10440 

cgcagccagc agtgcagaga aaccttggtt tagttgtgaa tcaaaaccag tactttttgt 10500 

aatttttgag cctatgcaat tctccaaggt tttatgttgt ttcttctgtt tctctgtagg 10560 

caccagaaat caaaacccca aataagaaag tgttacttga agattttaga gtacttattt 10620 

gtgtataagt gtaagtgata tttggaagac gactttactg cgctcctcca gcttggcatg 10680 

agaattccag gggcggaaag aaaggagggt gatggtacct ggaaaggaga gtcatgttaa 10740 

gtcccagcca catattaagt gctaaccacc tactgttaaa aggtgtaatg ttctagactg 10800 

acaaaataca tagtctctac cgtaaagtaa cacataattt agcagtgcag aaagatgtca 10860 

cttaaaagaa aacttgaata tatgctgaga tagttcacaa attaaagaaa tgaacaaaga 10920 

actgaggaaa taaaggagga atacaactgt gtccaaatga atacttaact gggtgggagc 10980 

tgttgcatat gtkagcaggt ggttcaccta aaagttggat gtaacgtagt taacgccagc 11040 

tcttggtgca cttacatatt gcattgcttc cgggcttaat ttgtgttcat ataggaataa 11100 

attttttgtt ggtttttaat tttactcctt gtaattccgt ggttgatatt caaagtgaaa 11160 

aaaattacat aagcttctaa tatatgagaa gtcttctcac ttgacatttt ttatttggaa 11220 

tttttgcaga gagtagtttt gtcacagtca aaagattttg ggatcttgca gtgagaaacc 11280 

taggtgtaat tcctatttct ctgccattcc gtatgtcatc tggattaagt gtcaacttct 11340 

cagtctcaag attctcgtcc ttaaatggaa tactttttgt catgctattt tgaagacaaa 11400 

atgagataat ac^tgaaact gcctagctca gtgaatggta catcatagat actcagaaaa 11460 

aacacaccct ctaaaataag aacagtacca aaagacagga tgtaaaataa gggcagtacc 11520 

aaaagacaca tgqatgctga gtgtatgaga aagaactttg tggccttctt gggtggcaca 11580 

ggccatggca gttccacagc atgacgtggt tgctgtgggt ggtagagcag acatgccgct 11640 

ccccgtcact gcctggcttt gatgcttgct ttcttcagct gagaggacgc agctgtgata 11700 

tgaaggtctt gtgtgtacag tcgtgacctc acatttccaa tttcctgctg gcagaaccca 11760 

cagtctacaa cgtacgagca ccagagttga cgtgagacag acagcataca gaggcttgta 11820 

acatccttct ggaaaacact gtgtaagctt tcagtgcgaa taaacatgat cagtggcaag 11880 

ttctgttaga tgtagtctgc aagcatcctg attttactgg gcaagactat gttgatttac 11940 

aggcggctga tgattccatg gatagcccac tactagtatt ttcacaaatt tcacaagaca 12000 

ttcttactgg aagattgccc tgttcttatg atactgctgc ccttttagct tcatttgctg 12060 

ttcagactaa acttggagac tacagtcagt cagagaactt gctaggccac ctctcaggtt 12120 

attctttcat tcctgatcat cctcaaaatt ttgaaaaaga aattgtaaaa attacatcag 12180 

caacatatag gcttatgtcc ttgagaagca gcagttaatt acctaaacac agcaagtacc 12240 

ttagaactct gtggagttga attgcactat gcaagggatc aagtaacaat aaaattatga 12300 
ttggaatgat gtcaagagga attctgattt ataacaggct atgaatgagt acctttccat 12360 
ggtcgaagat tgtaaaaatt tgttttaagt gcaaacagtt ttttattcag ctttgaaaat 12420 
gacttgcata aatctggaga aagattatca ggatttaata tggtgaatta tatggcatgt 12480 
aaacatttgt ggaaagcaag tttagaacat cacatattct tctgtttgga cagaccactt 12540 
ccaactagaa agaatttttt tgcacattat tttacattag gttcaaaatt cctaatgcat 12600 
ggtgggagaa ctgaagttca gttagttcag tatggcaaag aaaaggcaaa taaagacaga 12660 
ctacttgcag gabcctcaag taagccattg acgtggaaat taatagtttg ggaagtagta 12720 
ggcaggaatt caktatctga tgaaaagatt agaaacataa agccttccat cacaattccc 12780 
acccggaaca ggkattccta ctcatcaaaa ttctgcattc atacaagagg gaacctgatt 12840 
atgaccatct tcbgttggtc atttggtaga ttatgtggtt cacacttctt ccaaatattt 12900 
gcaaatcaga catcaccatt atcagcacaa gctaatagca tcattctgga atcatcacta 12960 
ttacaggaca ccbctggaga tgggtagcct ccagctttac cacccaaaca agctaagaaa 13020 
aactgttgga acbaaattca ttatttacat tttcaacaag atctggaaga tcatattaat 13080 
gaaacgttga tgttctatct tctcttaaaa aatctgctcc taatggtggt attctacatg 13140 
ataatcgtgt tctaatccga gtgaacctga cgaaaatgga aggtttggag tcaatgcaaa 13200 

gggggatatg atpagaagat gtctgtgatc gtgtcctgag aagcaccagg aacacctttg 13260 

acctcagtga ctirtcgattg aagagaagac caagttgtat tgatcagtgg ttgggacttt 13320 

acagaacaca cccatgattg ggttgtcctg ctttttaaag ccaactgtga gagacattct 13380 

ggggaactca tgcttctagt tctacctatg ctgcatatga tgtagtggaa gaagtgctag 13440 

aaaatgagac agacttccag tacattctgg agaaagcccc actagatagt gtccaccagg 13500 

atgaccatgt gctgtgggag tcagtgatcc agctaaccga gggcttatcg ctggaacatt 13560 

ctggacacaa tttgatcaac ttatcaaaaa aaaacttgga atgacaattt ctggtgccag 13620 

attaccttag aacctttgca aaaatagata gagatagttt tccttatgat gttacatggc 13680 

ttatttttaa aggtaatgaa aactacatca gtgtaattcc agcatcataa gtcagaacag 13740 

tgcttgtcaa ggggcgttac cacacacttg aacagatttt tggcagatga cttgggaaca 13800 

aggctcctcc atgtttgtaa tgttgaccac acaagttgaa tgtggcagag ttaaatgacc 13860 

ccaatattgg ccagaaccca caggaagttc atcctatgga tgctaccaag ccttctgcca 13920 

ctgagaagaa ggaagcactg tctttatctt caggaagatc acactgctgt ttaaccaaga 13980 

gaaaaattag agagtcatca atcacgcaga tccagtacag agggtggcct gaccatggag 14040 

accctgatga ttcagtgact ttctggattt tgcttttcat atgcaaaata agagggctag 14100 

caaggaaaaa ccccttgttg tttcttgcag tgctggagtt ggaagaacca gcgttcttaa 14160 

tactatggaa acagccatgt gtctcattga tctcattgaa tgcagtcagc cagtttattc 14220 

actagacatg gtaagaacaa tgagagagca gtgagccgtg atggtccaaa cacctagtca 14280 

ttacagtttt gcgtgtgaag tactattttg aaagcttatg aagaaggctt tgctgaagaa 14340 

agcaaaagga aaaaaagaac tttgtcatct gttaggttcc atttattgca tgataattgt 14400 

gtttgtattg attattgggc aagtagctgt ttgctatttt gatcttattt cagaagggca 14460 

taataatttt actattcaat gaaacgtttt aaacggggta gaaaaagact agtttttgta 14520 

tgctttacag cagaaatctt ataatgatta actggtaata tatttcgttg gcataaaaat 14580 

acatttaaaa gttcaagtaa ttataaacat tgtaaattgt atatgtaatc atattgaaat 14640 

tgaaattctt tatagctgta cttctgtgta atcaaagact ggggagagat agactagcta 14700 

gctctttctc ttktccatta atcacttaac agagttttga ataaaaagtt ccatttcatg 14760 

ggataagaat aabgacaggt taacctattt tagttggtta ctatgttcta ggtgttgtat 14820 

gaagtagttt acatagtttc actgatttca ctacaatccc aggaggagta gttactatta 14880 

ttacactcat tttacaggca aagaaatagg tttggagggg ttgggtgttt tgcccaagtt 14940 

ctcatcgtaa aatgacagat gaggattcaa attcaagtct taattgaagt ccattacttt 15000 

agaacctacc tcttagtggc tcttatgtta cagtataagg gagagcagac tgttccttta 15060 

cccttgtagg gtagctaggg cttgtgaatt aagagactga ttaacaggag aagaggcata 15120 

cacattttat tgacgttagt atttttacat gcacagggaa ggagggtttt atttttattt 15180 

ttatttttat ctttatttta aagagacagg ggtcttgctg tgttgccagg gctggactca 15240 

aactcctgaa gccaagcgat tcttctgctt gagattcctg agtagcaggg actataggtg 15300 

tgctcctctg tgcttggcta aagaaggggt ttgtatgtga tttttaacaa aggctgataa 15360 

attgtgaaga agtgactagt caaaggagaa gaggatttca gctcccaggg gtggtaaatt 15420 

gtgggaagat gactaggaaa tgtatagtaa taaggtttgc tatgcaggtt tattttgcca 15480 

gtttctggtc tcctaataag ggacagggaa acacctttac agatggaaat tcatatcacc 15540 

tttccacagg gaaatttatg tcctgcctta ggcagttagg ggaagggcag agaattcttc 15600 

ctgtatctgc tgtgtctcag gtgccttcag ctcaaaataa tccttatgcc aaagtagcat 15660 

atttgggtgt ggcatattct ctgatctctt tcaacagcat catctatact taacaacagc 15720 

aaaagttttt tttaaaaaat catgtttcaa gatttgcatg tggaagacaa atggacatga 15780 

ttgagataaa tgaagaatat atatttttta acaaagaatg ctgtatattt atgtctctgt 15840 

gacattgtgt tatggaggct aaggtgttaa gcatgtgatt actttagatg ccgtatgact 15900 

acctgttttt aagattaaaa aagaatcaat aggcagttta tatgcatggg agcaagttaa 15960 

aaacaacaca gafegtgatga aggcgaggtg aaactggtcc gcatctaatt caggccttct 16020 

cctgaaagcc agtgtgtgca agataaataa gtttgtttga cgaaagcaga ataactagtt 16080 

tgtcctttgt gatgaagata gttattcaga aatcattttt attggctacc tctgaattaa 16140 

taaatgaaaa gagaaatttt tttttctgta ggggatgtct gatgagttct taaaaagtgg 16200 

atgaacctga aattatcatg aacaagcaat tataatgaac ttaaaattac ttaaagagtt 16260 

atgaaaaaca aaaagaaaag ccgtatgttt tcttgtgcct tattttgaag tgacaaatta 16320 

tttgcagggt acatttgtag acggaactaa tgtgatttaa aaaatgagta ctagatttac 16380 

agaatgatgc ctttaaaaag tcactggtgc actttaatta ttttatttat gtttattctg 16440 

aaactacctt tattttgaaa atgaggtata gctttgccta ctggtgacaa aagtgtaaat 16500 

aattcagtaa acatctgtta aaaaccagct tggtgctagg ctcttggggt agaaaactga 16560 

tcaggccatt gaggagctca tagtccctaa ggggctgggg acttgtcatt aggtgtgcag 16620 

tgtgttctgg atgctcctga aggagtgtgg gcaggtgcgc accaccatgc ctggctaatc 16680 

tttttataat tatgtagaga cagggtctgg ctgtgctgcc catgctgggt ttgaacttct 16740 

gggcttaaga gatcttccct ccctgcccct accgaccccg cccgcccact ccacctcagc 16800 
ctccccaaag cactgggatt gcaggcatgg gccactatgc ctgggctgtg caaaactttt 16860 

aaatcagtgc atactcaatg gtcttgatgc aattctggct tgttggtaag agaatgggga 16920 

tttactcaca agccacgatg tcacttttaa ctctgaacag atcaagctat tggtattact 16980 

catttatgtc atcgataaac tttatgaata aaaactcatt gtgcaaatat ttaaacatac 17040 

tacatacata gcactgtgca gtttctaagg aaagtaatgg aaacctttgt cacatccctg 17100 

gcttccagaa ctttatgtta tctaagtgca tttgtctgca aagttgttgg gttaattgcc 17160 

cctttctttc ttctcttttt aagatattaa taaatagtgt catgaccaaa agataatcct 17220 

tatggacaag atagatctaa aaagccttag ctaatttata atcttgcata atccatgatg 17280 

acaagatgca gaaacaaaaa tgcccagaat aaaaacttag caccattagc agccatttcc 17340 

ttttaagtct ttacaagtat actcccagtt tcttgaaaaa tttattctaa aatatgtaag 17400 

acacacaaaa cagcagaagg actaatacag gtacatcgaa cacctgtgtg cctaccgccc 17460 

agtttaaaaa taaactggaa tgatgtttct ctcatactta cagaataaag ttttaatctt 17520 

tagcatggaa ttcaaaagac ttctgccatt ccagttcaga gccacccttc tggtctcctt 17580 

gctcctcagc cgpgacactg cccatgtacc caacaggcct ccagggttac tgcttccatt 17640 

cgttcttatt ctbatgaaca ttttccttca tctcatctgc cagaatccta cctaataata 17700 

ctcctgctct gckgtttaca gttctttaaa attaaaaaag gtggtgtacc ctttagtgtc 17760 

ctgaaaaaag aaaaaacaaa tttaaaacct taaaaaggta ccatattttc atagtatttg 17820 

cgttatgtct cattacagtt cctgtggaca tgtctgtctc ttttactaga ttgattgtgg 17880 

gctctttgaa ggaagatata tcttatgaac agtgttttat atattgttag caatcaatga 17940 

atgcttgcta tatttttctc atgaggatat tgattattct attttaattt attaccgtta 18000 

acctgtacta tacataactg ctttctgtac ctgagctatt tatgatctct gaggctcctg 18060 

tgagaaatct aabttttgtt aatcatggat ggaaatattc acaacatcat tcgtcagttt 18120 

cttcacattg tcttcctttg tatattacag atgttttaaa atatcaaagt aatgtttttt 18180 

tgttttatct tttagatatt gctatatgga gatttgccaa aaaataaaga aaatataata 18240 

tatttagcaa atcatcaaag cacaggtttg tatttcattt gcatgaaacc taggtttttc 18300 

tacagatggc acatgggcat tcaaaatacc gttcttatat ttaaatgaag tgggtttttt 18360 

aaaacagcaa ttttctgtgc agatattaca cctgttcttg tatttttgtg attttacttt 18420 

ttggaaagtc agaaacttga aagctatgaa ttttcctaaa cttaccttct ccctctgttg 18480 

gatgtaagta agctatcttc ttacttgctt gctttgtttt tcctttgtgt agctctttaa 18540 

agagtgtatt cattcttttt gtaagtgatg tttctagaag tagcattggt gggtcgaagt 18600 

gtgtatacat tttacatttt tgattgctaa gctgcagaaa agctgtattg gtatgtaagt 18660 

actcgtttcc ttactatgct cgtcatttct agtgtctgct cttcctttcc ttcttcaaat 18720 

gggtttggtt taattctagt tgctactgtt ccatcagagg aattgcagag aactggtctt 18780 

caaaacagtg cagtatatac tttaggtgaa gatacttcta aaaacctttg tattttgagg 18840 

taattctaga gtcccaagaa tttgcaaaaa gagtacattg tcagcaatat ttttcccaat 18900 

ggtgacatct taatataact gtagcacagt agcagaatca ggaaattgtc attgggtaag 18960 

gtacttttta attctccaaa taattcagcc ctccaaaaaa atcccacttc ttatgttttc 19020 

aaacctgtag ctacttttga tgcgtacttc ctaaattgca tttttattac tttaaaaaat 19080 

ataataccta gaagctcaaa gctggaaaca gcctgatcaa tatagtactc ttaagctaaa 19140 

aacaacctga tcaatatagt actcttaggg aaatcactta tgcctgtggc tttttttaaa 19200 

ttttcttcct gtbagctgtc tcttcatgat tttgtggttt ttattactgc ttataccata 19260 

gatgaggtat agaaagtaaa agaagttaaa atgcattttt ctcaatttag tgaattaatg 19320 

attacattca gafcttatagg acaagggttg aagctacaag gggttgatag gaatcttgat 19380 

gtatctgagt attttcccca actttattac atgactggtt cagactattt tatctaatta 19440 

catttcactc ttggcagaaa tagcaaaaca gtcaaccaat ggtcaatgct gctgagaact 19500 

ctggcctgtg capacatatt ggctgtttta cttctaatac cattctgctt ttcctgtcct 19560 

gctgctgatg gatgtttctt ccaggtttta aatatcaaac aaaagggatc tgtgggccca 19620 

gtacagggaa tggctcttga tagatttgat tttcctgcat ttcctttatt ttgatccagt 19680 

gttaatttca tgtagagttg tctgtttaac aggattctct taaaattcct tcttcagttt 19740 

acctgccagc ttttctttgt ccaggtttca gtatgaactc cactcgatta atagagctct 19800 

ctagtagtga cttgtggagt gggttctctg aacatttctg gaagtgttgc tgatagtgat 19860 

aatattgatc actagtactg ttaatttgtg tgcttactac atgttggctt ttatatgtat 19920 

tccttcagat taaggacttc tagaaaacat ccatgaaaaa acagattaaa aaaaacaatt 19980 

ctgcatgtat ttgggactag aaggtactat gggaaggata atcttcatac tcagaccata 20040 

ctgacctgaa tttcatttat cagtttagag aaccacttcc ccttcccttc accctacctc 20100 
cgagtgcctg tgactttgta tcaccgctct ggcaccacat cctcatccca gcaggatttg 20160 

ggaaggctgc tttttgaaag ccttttaaaa ttctgtaagt tgagaaaata ctaggggaat 20220 
gattttaaat ttctttagaa ttacaggctt tagtcagtat atgacagagc cttttcctag 20280 
aaaaatgtgc atataaaaat ttgcatgtag ttttagggtt tcagagaccc ctaaagccta 20340 
tccatagacg tggttcattg tctgattgtg tttaggtacc cttctaaaac ccttttgaga 20400 
tgttaggaat cacaacagag tatctctgaa aatgtaatta gcggaaagaa catttcaaag 20460 
actgttgttc tgcttagact ttctagtttg tcttctgcca ggcttgccgg aataaatgag 
tttcctggcc tgatactcaa aagaattgac atttaaatta gtctctctct tcccttgttt 
tcgcttgaca catccttgtc tctacattct gtctctgtct ctgttagctt atttctctct 
cgagtcagca ggatatagtg gctgttattt cttcccctta tccttcaacg atctactttt 
gacaacactt tgcctttttt tttttgagat ggagtttcac tcttgttgcc caggctgggt 
gtaatggtgc aatctcagct cactgcaacc tttgcctccc gggttcaagc cattttcctg 
cctcagcctc ccgagtagct gggattacag acatgcacca ccacgcctgg ctaattttgt 
attttcagta ga^atggggt ttcaccatgt tggtcaggct ggtcttgaac tcctgacctc 
aggtgatctg cctgcctcgg cctcccaaag tgcagggatt acaggcgtga gccactgtgc 
cctgcctgct atbtgccttt ttaatctcat gaaatgttct cttttcttgg ctgaagtgtc 
acttttcttg ttgaacagca tgcgtggtga gtagaatgtt ataaaaaggg atggactttg 
gagttagaga gabccaggtt cctgttcggc attgcagaaa tgctgttctg caataggctg 
tgtgtcagtg ggbaaattac ttatctctca gagccttatt ggtaaggtgt gagtgatagc 
tcctttcagg caccttacag aggctgtctc ctaatcctgg tagcgtacct ggctcataga 
tggcatttaa aagtggttgt gatgacagtc atagctcacc attagcatag cgctggatcc 
atggcaggga agcgctgcac atgcagtatc tcttggacta cacagggccc tcatgaatta 
ggaactgctg tttcatgagg atagggatga ggaaattaga cttgctgccc ctcactgcct 
tccactcctc tcctccaagt taatgggaac tatgactctg ctttggcttg attgccatgg 
aagattctca cacagccaaa tttattgcta tcttagttaa attatgccag aacacaaaat 
atgaagttat tgtcaaagta atataatctc agctgtaact gagatagtca gaaactgtct 
gtaatctgat gtcctatctg aaaggtagct gagaataaac aagaaataaa gagaattcag 
tagcaaatat tggtgacaca aagcttttat attttgacta gttaagctag ttcttaaatg 
tttccactaa aatattcaag tttaagggca tagcccaggg cagcttatta tgaacatgat 
gtattttgga aatcttacac tttctcttaa aagttcttgg gaggggcatg tgaggccata 
atataaccat aaaaccattt gttttaaaat aaaacccatt tttaaaattc ttccaaataa 
aaaaattatt gcaggaaaaa atgctaaacc tggtttttaa ctttgtacgc caactatatt 
tccaagatgt gctgtagcct ggtaaccata cagaaccata cagaattagt tctcagaatt 
tattgtctgc ttacttttgc atttggtaca ggtataacag ggtcgattat atggtttcta 
agacatgact agaaagaaat atgtttatca gttattattt cttccatcta aattagaagg 
ggctagggag agggcttcaa caggaattta tatactttag agaaaagtga tcattgatag 
cccaatagta tagatatctc aacccaataa cacaggttgt gtctgtctct gggatcatac 
actgtagggg agaatctttg caagcaacat tctacttata gggagccata acaaaagttt 
catatgtata ataattataa gtcttaagtc atcaagaaaa agttaacttg tgaatgataa 
tccctgatta aaaagagaga tgtataataa tggataagag atttttcttg gttaattttt 
agtattaaaa tggctaaatc ttctttggga tattctgact agtatggtgc attgtctaat 
agatttccca tapctgagag ctaatcatct tgtaatctgt ggaaaactgt cctctttggc 
taaaacttta ttgtaattcc tctaaatcct cagcttttat tttctacaga cttttttttt 
tttttaacat ttbcttcctc tgactcactc cttttgttct cattttcatg gcctgagaac 
atgggtgatg atpgaattat tcttttcaca gattaacagt tttcttttcg agtatcgttg 
agctcatgtg tgtattaact agagaagtct cccttacatt tcatttttat gttttctttc 
tcatcaggag atagtttgta gccatttact ttcaaatcca agtttctgcg gttcttaaga 
cctgtatcat ttgtctcctg aatttcactt catttcctct ttaaaccatg tcctctgttt 
cccatcttct gcacccactt tgccacttcc tgtttgttta attggcaagg gccactctct 
gtgttggaaa ttttttcttt ttgaaagctc aactaacaac ttctaggaag ttttttattg 
ctactgttat caattcatac catcttaccc ttgtttttgc aaccctttgt taataacata 
tttatttaac tatagttatt agcagtctga gatcatttta cttggttaca taaggagcac 
atatatctac ccagcatcat tgtaaggcat gtgagacctt tgtttgattg ctgtcctaac 
ctagtaccga gtcctaaaaa ctcattagta gaagatgaag tgtccttgcc ttttgctgaa 
catatatata cacactgaat atttagtggc aattcatagt tgcatttggc cattttttgt 
ttataatttc ccctttctca ttaaaaaaac tttgttttct agactttagg atttagagaa 
gctcattttg ttccatacac atgctgctgt tggattattt aggtattttg tgactgtatt 
ttatctttga aataaaaagc ctttcaagaa atgcaaaaaa aaaaagctca aaaaacagaa 
aatgtatatt ttttaaatat ctcagataga tttaaagaaa ttttaaacat cctaatcata 
gtacttttga agcccattca tagtacaacc tgtgaagagc ctcatgtacg cgctaactgg 
gtcctgtctc tgcagttgac tggattgttg ctgacatctt ggccatcagg cagaatgcgc 
taggacatgt gcgctacgtg ctgaaagaag ggttaaaatg gctgccattg tatgggtgtt 
actttgctca ggtaacttgt ttccatgctt ttctctctat atatgtagtt tataaatttt 
tttttttttt tttggagaca gtctcacttt attgctcagg ctgagtgcag tggtgtgaac 
acagctcact gcagccttga cctctggggc tcaagtgaac ctcctgcctc tgcctcccaa 
gtagttggga cc^tagtgcc caccatcatg cccggctaaa ttttctattt tttgtagaga 
tgggggtctc gccgtgttgc ccaggctggt cttggactca agcaatctgc ctgtctcagc 
ctaccaaaat gctggattat aggtgtgaac tgccataccc aaccctataa aaatgttata 24180 
ttttaaaatt takcaatata cttcatgtga atgtatggtt tttaaaatgg gtttaatagt 24240 
ttattctcag ttgaagtaat tttgtttggc atttttagtg gtgtgtattt atatacgtct 24300 
gattatccat atpcggtttt ccttcagcat ctgtggggat tggttttaga accaccacag 24360 
ataccaaaat ctaaggtgtt caagaccctc atatagaatg ggatagtatt tgcatataac 24420 
ctgtgcacta ctttaaatca tctctagatt acttataata tctaatacat tataaatgcc 24480 
atgtaaatgg ttgttatact ttatttttta tttgtattat tttaattgtt atattatttt 24540 
taatttttat ttgttcacat atttttgatc tgtgatttgt tgaatctgca gatgtggaac 24600 
tcatggatgt ga^gggccag ctgcagtaaa atgaaagagc aaaaatgcaa atgtacaaag 24660 
ttcaaacaaa ta^gaaattt aaaggcatag aatttgatag gcaattacat taaactgttg 24720 
ataacagtaa ttkgtgatct gtatgatatt aaaaaaaaaa agcaaactgt atatataaaa 24780 
cttactttct ccagttctgg aggctagaca tccaagatca aggtgttgac agggttagtt 24840 
tctcccaagg cctctctccc aggcttgcag acagcatcct tcttcctgtg tcctcaggtg 24900 
gtttttttcc ctgtgcccaa gcacccctgg cactgcttcc tcttcttaga aggactagtt 24960 
acactggatg actaatcctt ctacagagac tgctaaggtc ccactctgag gccctttttt 25020 
aaccttaatt accacctcta agtccctctc tctgaatacd gtcacagtgg gaactattag 25080 
ggctttagta gactgatttg ggggaacaca cttctgtccg taacagtgcc acataaatat 25140 
ctttagcagg attgattttt taaaatccct aaagatcgtg agtattgaca tgttaaggac 25200 
gctttttagt gactctgtaa taagtgggtg gaagaattgg gagttaaatc catctgatgg 25260 
atcaggtttt ttatttttaa aaatgtgtat ttaagaaaga aagcattttc attttaactg 25320 
ccaacaaaac taaacttcat gtgttttcca atacagtgtc acatgcagtt tttttgaatt 25380 
atgttgagac aaggcaattt tcagctaaat gttctttaga agctaatgtt tgaagatatt 25440 
aaatatagat taaattctga aatgtagttt tcattctgta ctttttgcaa gagaagttgc 25500 
ctttttgatg actctggcca attgttattt taaaagtaaa tgctctttct cccgatttga 25560 
ttgtggcagc atggaggaat ctatgtaaag cgcagtgcca aatttaacga gaaagagatg 25620 
cgaaacaagt tgcagagcta cgtggacgca ggaactccag taagagccta cccgttttta 25680 
tttttcttac cagctctcag tttctaaatt taagaattaa attaaaatct aagaattgtt 25740 
ttgacaatgt attttcccat gtgtaattac taattcaggg ttatgctgag gtaacagaaa 25800 
ccctctatgt ackggtaggc aggtttttca gccatcagaa agattgctgt aaacaactag 25860 
gtcctttgct ggtcagtgga ccttaaagag gaataaaaag agcatttggt gtcgttcaga 25920 
gtctataaat agaactaact gcattttaac ctgacattta agctagttta caagctcatc 25980 
ttacttcttg tcbtctttag tatcagattt ggttttagaa gcagcaactg ttttctgtta 26040 
gtgcaaattt tgkatgtctt acatgtacag aaaaaccaaa aaaggatgaa tctctacaaa 26100 
tgttaaatca ttqagtgtaa ataatatttt ataaaacttt attccacaaa agtggggaga 26160 
gttcaatctg ctttgtatag aatgctgatt gctgccaaag gcttttcccc tggttccctc 26220 
cggagacaaa gcaccatgat caccggggcg acttgggctt tctctttcag tacatgacat 26280 
gtgctcagaa gcttagctcg tgtgcacagg ctttcccttt cctttctggc tccctccctc 26340 
tgtcttccct cctctcctct tgccctcccc tcaccagggg tcctgggcag cagctggagc 26400 
tcatggtgaa ggaagaattc ttcatggtca gctggcgaag tgcctggtgt gagcattgtt 26460 
tattcacatg cctcttctag gtgtttttac attagaacat tgcatctgtt ttgggcatgt 26520 
gttgggtgac agaagcagaa tggaatgaga tgaacagtga ccctttatcc tgttatagct 26580 
aacccttgag aaccaagctt ggtgtcttca aagggtctgt ttagtctgaa acagtgtggt 26640 
gaatttgggc agaattgtgg tcattgcatg taggtctcca aaagacagaa taagttggta 26700 
atatggttta tcgacttttt acaaaaaaaa tttaaaaatc atgaatttat accttaaaat 26760 
gtccatccca cttctctccc agctgtccag tcaccccagc aatggatgac tgctgtggag 26820 
ttccttctgt gtcctgctgt gggcattgta tatatgaagc aaatgaagat agctgccttt 26880 
tgggtgatgt tggcatccta tgcacagtgg tcccttgctt ttttgccccc atgaatatag 26940 
ctgccagtgg cgctagggct gaaaaaatca gctctttaca cttgtcatgt gtcttgttta 27000 
tgtggctgcc ttcgtgagtt tcttcttgtt tttggtttgc agcagtttaa gtatcatata 27060 
tctgagtgtc atttaaaaat ttttacctgg attggtcctc tgagcttgga tctatgattt 27120 
ggtgtctgtt attaattttg gaaatttctt tgctcttatt tccttaaata ttattcctac 27180 
cccagtcttt cttctccagt tatgtttgtg ttggttcatt tctcgctgtt ctttagttct 27240 
tagatgcatt at.tcgttttt tgttggtttt tttttaaatt ttttttttta cgccccctcc 27300 
cttttttctt tttgtgttac attttggata atttctgttg acccaccttt gagttcatgg 27360 
attcttcctt tggctgtgtt gagtctactg gtgagccagt ttaaggcact cttcatctct 27420 
gctactgcgt gtttcattcc tcacatttcc ctttgaccct gtttcatagt ttccatctct 27480 
gtgctagtgt atctatctga tcataaagct tagtcacgtt ttccagttga acctttatca 27540 
ttttattata cttgcagttc tcttaaattc cctgcttgat aattccaaca tctgggccat 27600 
atctgagtct gc^aattttg attactttat ctcttcagat tgtgctttat cttgcctttg 27660 
tcatacttcc taagattttg cctaacgctg ggcctttttt gtaagacagg agaaatggag 27720 
gcaagttgtc ttgatacctg gaaatggata gacttgtctt tctgcttggc ttttagtgtt 27780 
gaggagtgga gtcagtccac tgaggaggtg cactgcattt gggttttgct catgtgcttt 27840 

ttctcacagc ttcaggtttc tgtagaactc attactttgt ttgtaggttg gggatgtcct 27900 

cccgctagag cttttcctca gtgtctattt cacactcagc gttttcacat agcaccttgg 27960 

agtggctctc ttctttatgc ctttccccac tatacttctt ggatacttgt tactgaactc 28020 

tcgctagttt ggtggtagaa ggagagggaa gggaagtgtc ttttcattct tagggagaat 28080 

ctcaggggtg gagccttctc tgatcctgcc ttgcttctgg ctgtaagtct gtgcccagta 28140 

tgtattcctg cctttactaa gagtttttcc ctgttctctt cacccagcct catcgagtat 28200 

tcatccgtgc cccatgggta gcagggtttt gttgcccctg ttcatcagtt tcaggctgct 28260 

gttccatagg aaaggtagaa agaaggatgt gggctgggcc ctgagccctt cccacagggc 28320 

tgcttttccc tcccacaagc ctacatccag tcttccctga ccgcagtgtg ttttcttttt 28380 

tctttgtctt gtgagtacac aggaggtctg tgggtcgagc ctgtgaaatg tgctgcattc 28440 

tccttgtgtc tgtagcccag gggttcgtct gttccactgg ctcatacttg gctttctgca 28500 

aaattgataa aatttttagc taaattcttt ttactggtat ctgttacatt ggcccccaac 28560 

taaacaacca cttgcatctt gtttctcctt tgagttttcc atctttcctt agacttttgg 28620 

gttagttggt tgccttgcaa ccttgcagct ctctgaaggg tctaagaaaa gtcatgaatc 28680 

tacagcttgt cagtgttgtt gttgttgtag ggttggcagt agtattcctt cagcattcta 28740 

catacttaat ggaagccgcc tcccattttt ggttaataaa tttcaaaact tggaacaatg 28800 

ttagatttac aaaaacgtca gaaagaacag agtgttcctg tttattcttt atatagcttt 28860 

tttttttttt tttttttttg agttggagtc tcggtctgtc acccaggctg gagtgcagtg 28920 

gcacgatctt ggctcactgc aacctctgcc tcacgggttc aagcaatctc ctgcctcagc 28980 

ctcctgagta gctgggatta caggcgtgca ccgccatgcc cggctaattt ttgtattttt 29040 

agtagagaca gggtttcacc atgttggcca ggctggtctc gaactcctga cctcttgatc 29100 

cgcccgcctc ggccccccac agtgctggga ttataggtgt gagccaccac gcccagcctt 29160 

cttcatctag ctttaacatc taatgttgac atcttacata acatggtata tatttgtcaa 29220 

aactaagaaa taaacattgg taccacacta ttaattgtac tacagatttt tattcagact 29280 

ttaccaggtt ttccactaat gtcctttttc tgttctaaaa tacaatccag aatagataca 29340 

aatccattca acttcagtgt tttaaattat tgtttttcat tatatgaagt gctgtgtggt 29400 

ttttgtcaaa tctgttattt tggttttaat cttcaagctt gtctttgttt ctttaagtga 29460 

taaaggcata atttaaaagg tgtgttgggt tatttcagtg cctaaagtct tgtctgagtc 29520 

acttgttttc tgctgttctt gcttatggta ctttctttcc ttgtttgctt tgttatcttc 29580 

ctttgctgct ggctgtgttt ggttaagtta tttgtggaaa tcagttgaag cctcaggtgg 29640 

gagtgtcttt ctccggagaa catttctacc tgttttagct gggcccctta aggctcctct 29700 

agcgtgggcc ccacccaaac gagattctga gttgaaggtg aactgagcca ttcaggcagt 29760 

gcagccaggg ttgcagatgc acgtgagacc tgctcacctc tcatttactt tcaccctgag 29820 

agtagagcct ttggtgtttc gttcacttgt ctgattctct cttcacagtt ctattagaag 29880 

gtccatgggt tttggtttct gtgcccttca tcttatgagt cttgtaaatc aaagttctgt 29940 

tttatgctta cttctgcttt actgtgtttg cttaatttca gtcttaacat cttgccaact 30000 

cttgggtact tttaaaataa tgttatatcc agctttttaa gttgttttca gtaggaaggt 30060 

tgattcaaat aacctagtct ggttatgggc tacgagaata gcctccctgt tttttgtggg 30120 

caaaattcca gccttttatg ttcctagcgc agtgtggata acagactggc aggttcaaga 30180 

ggccgtgctg agcagctttc actgtaaggt cactgtccca ggtcgggttt ctaagaatct 30240 

ggatggttgt ttcatttctt aatatgtacg ccctgtgaga gcggatacat cttgctcagg 30300 

ttcttatgat tcttttgttt ctgaaggtga attaagtaag tgacatggta gaatatgtta 30360 

agtcaacttt cgtgtggctt actagttctc atgaatctat tccatgattg tatcagttct 30420 

tattcagtat tagtatttaa gaaatgcaga attttgtttc aaaaaatata tttgtattat 30480 

aagttgtgaa gaaatacatc tccataatta ttgctgggac aatacagtat tttcttaagg 30540 

aacttattgg ttgtggatgc aaatgaagca tatttgtgat aaaaataact aatagaagtc 30600 

attttgttag actatgagct agtaaaactt atggcacaaa catggagact taacactttt 30660 

tcttccagct ttcacttaag ttccttttca gataggaggc agcctggtgg ataagagtat 30720 

tggttttgaa attagattca ggtttaaatc ccagatcttc tgtttaatct ttattttatt 30780 

tcaggtagat tttctggata acttgctata gcttatacgt cagtacttgc cacttcaatt 30840 

ttatgttatg gagagacggc ttctttcctt aaacctcacg aaccaacctc tgctagcttc 30900 

taagtttttt cctgccactt ctttacctct ctcagccttc agagaattaa agggagttag 30960 

ggccttgctc tggattagga tttgctttaa gggagtgttg tggctggttt gatgttttat 31020 

ctagagcact caaactttct ccatatcagc aataaggctg ttttgctttc taatcattca 31080 

tgtgttcagt gaagtagcac ttttaattct ctttaagaac ttttcctttg catccgcaac 31140 

ttggctgttt agtggaaagg acctagcttt tgacctacct tggctttcaa cataccttcc 31200 

tcactaagcc atttctagct attgatgtaa agtgagagac atgcaactct tcctttcact 31260 

ggaacgctta gcagccattg tagggttatt aattggccta atttcaatat tgttgtgtct 31320 

cagggaatag ggaaacccaa ggggcggtag agagaaagag agacaggaga acaggccatc 31380 

attggagcag tcagaacaca cacgacattt atcaattaaa tttgtcatct tatatgggtg 31440 
caattcatgg cacccccaaa caattacaat agtaacatca gagatcacag atcacaataa 31500 

cagatataat aatatgaaat attgtgagat taccgaaata tgacacagag acgtgaggtg 31560 

agcacatact gttggaaaaa tggcaccaat agacttgctc gatgcagggt tgtcataaac 31620 

cttcaatggg aaaaaaatgc aatttccgtg aagctcagta aagcgaagca tgataaaatg 31680 

agatgagcct gtcactccta agaatgttcc tgtacaagtt ttttgcatct gttacttacc 31740 

ttttcctatt tgtgaatagt atcttttttg agtacgtgtg tttttttatt tttatacatt 31800 

tatatgtatc ttttgaagaa catactttta agcttaattt attgattttt tttctctcat 31860 

aatttccact ttttgtatcc tatttaagaa gtccttgcca aacttaaggt tgctaagatt 31920 

ttctcctttg ttttcttctg gaaattttag agttttgctt ttacatttag ttctaggatt 31980 

tatttataat taatgttttc atatggtgta agatcgaagt tcatattttt ttaatatagg 32040 

taaccatcac tatagaaaag attatttccc cccaatgttt gaaataagta gactgaatat 32100 

agatgggtct gttatcccta gatcaatgga gcatttgttc tgttatattg atctatatat 32160 

atatatcctt atgccaatac catactgtct taataatgct tgctttgcag taagttttta 32220 

aatagtgtag ttgtcttcta aatttgttct ttcttttcaa agttgttttg gctattttag 32280 

gttttttgca tttctgtgtg aattatagaa ttagctcgac aatttctacc caaagtttgt 32340 

gggcttttca ttttgattgt attgaagata tagatgaatt tgggaagaat tgatataaca 32400 

ggattgaatc tttggattca tgaacgtagc ctgcatttgt ttacttaggt cttctttatt 32460 

tatctcagtg tgttttgtag tttaatgtac agatttgcac atcttttgcc agatatatcc 32520 

ctaagaattt ca^tttttga tactattgta gatgacattt aaaaaaattt caagtttttg 32580 

tttgttgacc taggcatata tttgactttt taatatacta accttgctaa acttatttat 32640 

catctagtaa cttacaaaat atattcctta ggatttccta cataaacaat catgtcattg 32700 

ttttagaaat aacagtttta ctttgtcctt tttaatcttg atggctttta tttctttttc 32760 

ttgctaaatt ttctggctag acctcctagt acagccttga ctagaactgg tgtgagggaa 32820 

atcctttcca tattcctcat ctttagggaa aagcactcat tcttttatcc attctttagt 32880 

tcctagcccc attgcccttc ctaaattttt tctcatcatt ttccttcatc acaccttgtt 32940 

ctttttcttt gcaatcatat catgatatgt aacgacatgt ttttatttat ctgtttaatg 33000 

tatttctttt cctcacttgt ccatgaaggg aaggaccata tgtgttgtta tcctttgtgc 33060 

agttcctgga acataataag tatataagaa atagtttctg aattagctgt gaatgaattc 33120 

atgccttcct gctgtctgtc aatgttcttt taaattaaac atctaagaca gcaaataata 33180 

ccacatgagt tattaacctg agaaataatc gttttattta taaatgactg agttgaaagc 33240 

tgatagccca cagtaattgc tttcatggct ttgaatataa accttactgt tacaaaacac 33300 

attttcatga aaatgaatgt gtggtgtttg gaactagctt taatgtttgt cttcctgttt 33360 

ttccttctag ttgctataat ataataagga attttgtatg tttttcctaa ttgtacccac 33420 

ttttctacat tttcttaaca gatctggtga atcttcatta ttaaatataa ttatacatat 33480 

aaattattgt ttaataataa tattaattat taaaaataat ataaattatt aaatataaag 33540 

atacatataa tattatctgt taatttctaa gttaggtgtg ggttctgaag actattatat 33600 

gaatgaacaa aaagcttgca tatttgcgtg gaagctgaaa gtacgaaatt tttagatacc 33660 

attataccag tatctaaaga aaaaattcag taccacatag gtttttaagt aggagctgta 33720 

tgatcatagg tcatccagat gaaggaaggc ttctgtacca gacgtacaga ggtagacagt 33780 

gttgtctgag tactgtctga gatctggcaa gaatgaatcc aataaacgta gttttctccc 33840 

atgagctcct gtcttgtttc ctgtattctg tttgtatttg aaaagatttg gtgtgcataa 33900 

cttatttttg tcttttggct gtcaatcaaa gttattagtg tagtttttgt aactcagttc 33960 

tcaagctagg agbttttgct gtataatttt aatgtttctg tttttacttt cctaagcaga 34020 

taagcgtaaa aacttagact aattgattac ttattaaacg tccagcttga tattcttctt 34080 

tatattattt tagtttcagt ttatataaca aatgaggttt cttataaata aaatttaaaa 34140 

tgcactaaag gagctgtgtg aaataggaat tctgtgtgaa gcttttgaat gtgaacattt 34200 
agaacgtttc acatggtggg aatttactat atgattttca tcaaatgagg tactttttag 34260 
tgttggtact taacgatact gatttctaaa atttgtattt ctaaaaatga cgtattacag 34320 
gatctgaaag ggbaaaaact cattgaggct ttgtatgagt cagcgtttca tggcctattt 34380 
ttaattagtg aattattagc atataattag aaatgttttt agattcttca tggctgacct 34440 
accaatgaat gtagcactgc atttaaaata tagttcacgt tatgttcata cttaattgtt 34500 
gcattttgtt tgcccctctt gaaacgaagg tcacatgtaa ataaatatac attttctcct 34560 
actgtaggaa atactctgtt agcattagta ggtttagctt ttttaggtta acaataacaa 34620 
aaacaaagct cacacaaaat aaaccaaatt tgctctatgt cccacagatg tatcttgtga 34680 
tttttccaga aggtacaagg tataatccag agcaaacaaa agtcctttca gctagtcagg 34740 
catttgctgc ccaacgtggt aagtaaaaat ttgagtgttt gaacaaataa ttttcaaaga 34800 
taataacatt tttagttttt cttcctggaa aagatacttt tgttttacag ttgaaggaat 34860 
gaatgtattc attccttgaa ttagtgtaca tattatctct taggaaatga agtttcttct 34920 
ccttaattca ctttcatgct attattacat atatctgaga aattaagttg aagtgcttgt 34980 
tacgatacat attcttgtgc catggattta tttaaaatct atctaagtac atgattatgt 35040 
agatggaagc tttttctaca gtgtatgggt tatatgtaat ggagcttctg ttttgtaaga 35100 
tgacagacct aagttggagt ccaaactcgt acttttatta gctgtatggt tgcaacttgg 35160 
aagttgtgta atgttgctga gcttgcttct tcatctctta aaagaacata tgccttataa 35220 
gtagatctaa atctgtgtga ggattagatt agaaaatatg tcaagtttct attggagaag 35280 
ttacacaaag ttggtccaca gtgcttggaa gctgttaatg tcttcaacaa tggtaatgtt 35340 
cttaatatcc atattttaga aaattgaata attggtacac caataagcta tgcaatttaa 35400 
ccaaattggg aagtatacag aaaacagtgg ctatgctatg ttcttagagg tgtctttgaa 35460 
gcttgactgt gatttagtgt gtgatctcca tatgttgata gtcactcact gagcaaatac 35520 
cttgttggtg acattacagc agggcctatg acagtgctgt ctaatggaac tttctgcaat 35580 
aatggtaaag ttcttcatct gttctgtcca gtgtgctggc tcctaccaat gtggtttttg 35640 
agcattcaac at^tgactag tgcatgaaac taatttttaa ttttatttaa ttttagttta 35700 
attaaaaata agggggagtt tttacaaggt gcttacaaga gcagatatgt cataggtata 35760 
tgacatcatt tgtaacagta cttttaaaaa atgccagttt gtttttaaac acatgtccta 35820 
ttaagtaagg agtgcttcag aataggaggg ttcagttggt ctccccatct gccagctctc 35880 
ttttgacttt cattgcttcc tctgtctaat agacatgacg ttctgtcatt tcagttgctc 35940 
ttttgcaatg ccattgtctc ttttgccctt ttcacattta ttaaacagaa caaaacaaaa 36000 
accactctcg aatctgtagt ctacctttgt tgtaagcact ttttccagta ctcactctgc 36060 
cctcaatttg ttttggtctg atttgaaatt ctctccctag acttctgtgg ggctgttctc 36120 
cattatcctc cckactctct ggcgattact tcctagcctc ctttccagcc tctttctgct 36180 
tcatttctcc ctgctacatg tgttatttcc agtgtcaggt tttggtgttt gattaatttc 36240 
actttttgtt tctcatggtg gccttcctct aaatccatgg ctttagccat cgtttccttg 36300 
actgctgatg actcgcaaaa gcttcctccc ctccatgtct ctctgcctaa ctctggaccc 36360 
atttgtacaa ttgtccatta gagagcttcg cttgactggc ccaaaaggat gtctcaaact 36420 
cagcatattg aagatagaat ttatccttcc atgcatacac tcatatttct tgtcttggta 36480 
actccatcat tcagtttttt tgcctaagtt ttattcacaa aaagaacaaa ttgatagcag 36540 
ttgcatacct cttataggaa acttagacat ggaggaagaa gctgttcaga tggggtcctg 36600 
cagaagtgca ggcactgtgg taatatttaa acttttctca gctgttcgaa gggttttgtt 36660 
ttaactaatt ttccttagac ttgttttagg tatttggctt tctaatggtt ataagggatg 36720 
tggaattaaa tgtatcttaa tctgccacct ggacccatta aagtaagccc ctatggtggt 36780 
tttttttttt aattgccatg gttaaaacca tagttgctag cgaaggtgac atacttaagc 36840 
tttttgaact ctcttaaaag aaaacagaaa tttaatgatg tgtctataat ggcaaaccag 36900 
atacctagaa tttccatgtt attcataggg tgaataacac tggcgattgt agagatttga 36960 
gagttctttc aaaacaggag aacaaaggga ataagctaca aagcaatttt tttctttgta 37020 
gacttaactg aataaaaatt atttttatgt ctcaaacatc atatgaacaa atttagttgg 37080 
caaatggcaa gctaataata ttttataata taggatatta atatacttaa tattacaaaa 37140 
gtgcttcata attagaaaag acataaacta gaaaaatggg aaaagggcat gaataagaaa 37200 
ttcaagagat ackaatgacc cacacacttg aacaaatgtt tattctttct cataatcaaa 37260 
gaagtagaaa ttaaatgaat actttgaagc caacttctga gaaagcatag caaacaagaa 37320 
agctagtgct cagctttgtg tggtaacggc actctcgctc ttaagaaggt gtgtttgctc 37380 
cctgtggctg ctbtcaggca gggccacaaa cttggtggct taaaacacca cagatttctt 37440 
ctcttacatt tgagaagtct gaaatgggtc ttactcagct gaaatcaagg tgttggcagg 37500 
gctgcagtcc tttgtggagg cttgggggga tcttgttctc ctgtacgggg tcctgtgctt 37560 
ggttcggggt cctgtgcttg gtctgggatc ctgtgcttgg ttcgaggtcc tgtgctgggt 37620 
ccagtgctct gcttttacca ccttgaagtt catctggaaa tggcactggc tcgcccacac 37680 
catatagctg actctggttc tccctcctcc tcactcgctc taaacctgtg tttttggctg 37740 
atttctaatc tctctttcct tggcccttct gcagcttgca gggccttctg cagctcttgt 37800 
ctgccccagc cccggggtct gcccatccca gtgctgggct gttctgttcc tgccctgcct 37860 
ttcctcagcc cttggcaacc ctgtttgttt tctcccttcc ttagcagtgg agaacatcgt 37920 
aagatcaatg ctgactgcct tctgcagcca agccaggcca tttcatttca gccgagccaa 37980 
gtctgtgtgg agcagttctt ttatttttct ccttttgact acctcatggt tttcacggat 38040 
ttttgttctc ttcacattca aggatttttt gctttcagaa agttatattt ctctggaaag 38100 
agtgcaccca atatcccttt tgatttcaaa atcttaatgt ggagtctctt gacttggatt 38160 
tctttggaag aaactgctga agctgccatg tctaagaaga aaactttgga gaaaaatttt 38220 
cttcttagac atggcaacgt caacagtttc taagctcttg attccgtcta ccctgtctcc 38280 
atcgttgcct cagtcatctg ccttacttct ctgcaggggt ttctcccagc ttgcaaatgt 38340 
actccaattc tgaaataact aagtctatag ctgtgcaaag agaagtctgg gccccttgct 38400 
ttcttgtgtt tgactccatc cactctccag aaatgaatcc cacttctcac ttaaccactg 38460 
acctccaaag catcgtatca tttgtgtcag ttgtcatatt tgttaacttt cacataactt 38520 
ttgacattat ttataccttt ataaccagga aataatttta actttattgt agaaataaac 38580 
aatggagtat aatttttctt gttgaagata aatatcacct cctcttcctt taaacatctc 38640 
ttccctttgt ttttgtatta cattggtttc cccccttttt ttatttcctg ggttgtcgta 38700 
ttccctgtta ttatttttac cttttttttt ttaatgtgga tgtttccgga gtctgtattt 38760 
cttgcctttt catcttctgc cctttattat tctcagccac tgccattact tcagttatcc 38820 

attcccatgg tttccacatg cttagcttcg gttgattctt gccattttac agaccatatt 38880 

tccaactact tctagaatgt tttgttcctt cagcctcagt atgcccaatt tgaactcatg 38940 

ttctctctcc ccbttctttc ttccttcttt ctttcgctct ctctcccttc cttcttttct 39000 

ttccctccct ccbtttcttc cttccctcac tcgttctctc ttgcttgctt gctttctctc 39060 

ctctctctct tttctttctg ctttctttct attcttctcc ctccctctct tccttctctc 39120 

ccccactccc cakcttccag gctaaagcag tcctcctgag tagttaggac tacagacata 39180 

cacgtgccac cgcgcccagc tccgtgttct ctttgtttcc ctgcctcctg ctcttccact 39240 

tatctttgca tg^caggtgg gtgcacgcag gcatgctctg catgtcttcc tcttggccat 39300 

tccccttcta gttatggtgt ggctttatct acgcgttctg gagcagaagc ctagtcacaa 39360 

agctattttt ttaaaacatt catgataatt catttccttt tatgttttaa aaatactagc 39420 

tttctgtctt tatttcctta ctaacttact tggatgccag taattagttg ttttagtgaa 39480 

caccacagag tgatattttg aaactttgga cttcataaag ttggatgagc tccagtagca 39540 

aagaaggaag tgttaactag tttaactgac aaataaatgc ttcccagctt ggtgtgcgat 39600 

tgagattttt gttgcaagtt tgtgaatcaa tttaactgcc cctgccctgg ggactaaagt 39660 

cagatacgtg cttgtgggaa tctttgtctt tcccacacca ccctgcattt feaaaacctct 39720 

tgtgtgggac agtcccacca tgtaatagct gttcttcctt actcagctac tttccctcca 39780 

gagaggccag tagaaaatct agactagttt tttatagtct attttcatgt cacttattga 39840 

gagctactgt tttctgttaa attgtcagta aatattttaa tcaaggaaaa gggaggcaat 39900 

aggaaggaga gaagaacaaa tccttaaccc tagtaggaac ctaatgaatg ggatttgttc 39960 

tggataattg cagtagtccc ccagctaaag aaccttttaa aaatatgtca gatataccca 40020 

agaggattga aatcgtatgt tcatacaaaa gcttgttcac ctgcagcctt catatgcaat 40080 

tcctatgaat gttcatagca gcattattca taatagccaa agtatggatg caacccaaat 40140 

gtccatgaag caattaatag gtaaacaaaa tgtgatctgt tcacacagtg gaatactaac 40200 

tattcagcca taaaaaggaa tgaagcactg agtcctgcag ccacacagat gaacctcaga 40260 

tccatgctga gcgaaagaag ccagaaacag gaggccatgt gctgtgtgac tgtatttcta 40320 

ggaaatcttg agtcaccatg ggcaagatgc tatcaccttt gttcagtggc cagaagcgag 40380 

ggcactaata tttacccttg ccggggtcta ctagattgaa gcgtttccgc taggccataa 40440 

acttccaaca cg^tgacttg tacatgtaga tatttgatca atatatagca aatgaatatt 40500 

gatttaaaca ga^aaaggca agtgagagtg ctttctaaac ttagagccct aaatatatga 40560 

ggttgtggaa tt^atagatt ctgttgtgtg tgtttgaggg aatttaaaaa taatttagat 40620 

gttaaacagt atattgtgga ggtgttttgt aactaattaa tgacggcact gaattgactt 40680 

ctaggccttg cagtattaaa acatgtgcta acaccacgaa taaaggcaac tcacgttgct 40740 

tttgattgca tgkagaatta tttagatgca atttatgatg ttacggtggt ttatgaaggg 40800 

aaagacgatg ga^ggcagcg aagagagtca ccgaccatga cgggtaagtg tgttcacgca 40860 

cctgaaatgc ctgtacacgg tatatacagt gcacatgttt atgtagaatt cagttttaca 40920 

aagtaggtta agtgtacttt tttcctccat tacatttacc cggtatattt ttcaagatgt 40980 

tattaagatg taacagtgga gatttcatta gtcctgcaaa gtgtggtatt tcttggctgt 41040 

cgtgtgagtc ctgtggactc accaattatc attaatccag cctctttcta ctcaaagttc 41100 

acacttaaaa ggaaagctct gtaaaaggga ggaagacgtg aagaaggagc acgcctggca 41160 

gtactgagtg cacgttatta gtcagtgctg cccttttgct gtatttttcg taaaatattt 41220 

attaaatttg ggtgtcattg tgacaagaag aaatgcagtt aagtgtgacc tttttttttc 41280 

cccaaacatg ttaggtttta agaacctttg agctattgtc agatataacc agaaaaaaat 41340 

agaattttaa gtgagcagga taacttagtt aaactaacca aacatagtgt tagctgttag 41400 

agaaatgtaa acatggaaat aggcaaacag ggaagtgtgt ggagtttctg tttccttttc 41460 

aaaatatctg tttgagctgg ggttgagaga gaacactagg cttcatgggg tttttttgtt 41520 

tttcgttttt tgttttgaga caagagtttc gctctgtcgc ccaggctgga gtgcagtggc 41580 
gcaatcttgg ctcactgcaa cctccgcctc ccacgttcac acgattctcc tgccttagcc 41640 
tcctgagtag ctggaactac atgcgtgtgc caccatgcat gactaatatt tgtattttta 41700 
gtagatatgg gatttcacct tgttggccag gctggtctca aactccttac ctcaggtgat 41760 
ccacgcacct cggcctccca aatgagcttt gtgtttttac ctcatcagct gtttggggtt 41820 
gagccactat gtatgtcagt gtgcttgtat cagtaggatc tactgagggc agatgttcaa 41880 
aatatgagcc tccagcacgt tttacatgga aaccctcacc tgaagcattc gtctgaagtt 41940 
gatgtgcctt ggaaatttta tagagtaata tttttaacta caacaaaaca tttataaaag 42000 
tagacattat taaagcattc agaagtgagc aaggatagaa attattctgc ccaaccttac 42060 
acgtaggcct tctagacgta gtactgtgca ccgttacatt atctaacact gtctgtgtgt 42120 
catctttgga tgttagggat ttttccaaag ttcagtgaga ttatagttgt caaatgatta 42180 
gtctgttaaa taatgataag atgagggtca ctcaggtttt aaaagaaaag ctctttgact 42240 
gaaagagaga gcagctgtct actgcagaaa gttagggagg gaggctggag gagtgaggcc 42300 
caggggctag ctagtataaa aattggttat ggtcgaagga aaaaaaaatg taacatattt 42360 
atatctgaaa gatgattgtt ctcataattg tatataacac agagtaattg taaagtagaa 42420 
aactaaggtg tttttcattt tagatgtaaa tgtttagaat atgtaatgca tcagtttaaa 42480 

aattaaaact gtkcgaaatg cacagtgaaa cgtcttcctt gctttccacc ctgctacctg 42540 

gccttccctt ctbcttccta gcgataacca gttttcttaa tttgttgtgc gttgtatgtg 42600 

caaatttaag tatatcttct tattctacca tccctccctt cttacagaaa agtggcatat 42660 

taatattttt ctbttttaaa ctatcgaagg agttacttac ctatttttgc atttcaaaac 42720 

agacagttca tcaagattgt cgttggttta ttaaacatag tttaagatta aacaagtgtt 42780 

tataaccaat gaaaaacaga tagactcccc ataataacct tgtttaaatg ctgctacttt 42840 

tatcatgtcc cctcctgtct aagaacccct tggttcagca gagctcatgg gtaaggccag 42900 

cctctgttgc ctgccatcgg aggaatgcgt tccagccgtg atctctgcct tgccttcgct 42960 

tcctcctgtg ctgtgccgtg aagcctcggc cgtggtgaag ctggctgact gagtcctcct 43020 

gcaccccatg catattcagt agttgaaggc tttgtgtggc caatcctgct ttccacagga 43080 

aaccaccctc tcttttgttg ccctcatcca aggctactgt tctcccagag tgacaggcgg 43140 

cacctttccc agcatagcac tgtgccttct cctgcccctg ctcttgcagt actgctgtgg 43200 

cactgatggc gtgtgttaca gtgctggcac ttagcacagg gctctgcctt tctctcttcc 43260 

cagccgcatc ataagtgcct tgaggaagcc aaaaccttct gtgagttgca ttgcctgggt 43320 

tccaacctcc cactgccctg cttatcctct gctacatgtg agctgactgt ggctttgggg 43380 

tggtcactgc ctatgtgtat tcattacaaa ttgtctcctt ttgaaagatt gacctttctg 43440 

acttacccag ataccataaa gaaaataaaa tcttatcact tcagtcaagg ataaagtatt 43500 

tctgaattaa aggaaaaata caccagagta aaatcaagac tgaaagacaa actgggaaat 43560 

tatttgcaac ctagatcata gaaaaggggt catttccttc ttgcgtaaag tgcacttaca 43620 

aattgataag aagatgactg ataactagaa agaaaaatgg gtaaagaaca acaatagaca 43680 

tttcacattt aacctcattc atgataaggt aagtgcaaat gaaaactaca ggggatacct 43740 

tttttttttt ttaatccatt agattggcaa acatcccaag gtttgatcat aggctcagtg 43800 

ggtgagattt aaptattatc aggcattttt atactttgct gttaggaatg caatgtagta 43860 

caaacctttg tapaagttgc tttggaaatg tctctcagat gtacaaatgc attcacattt 43920 

tagatttagc attcccgctt tctgagacat tattcaacat gtatacgtgt gcacataaga 43980 

tataataata acacgttttt ccttctagtg tgttgctttt aacctgtagc ttgaaaaaac 44040 

tctgctttca ttgttttttt ttgttttctg tcactggctc agccctgctt tcaattgttt 44100 

atatgaattg atgggtgttc tggtctggtt ataatctact ttagtttaag agtcacttta 44160 

aattatatga catctgatat aagttgtgtt aggtagaaaa ttctgtaact tggaatactg 44220 

taagtacttt gtggccacat ttcattagta ttaaatatta tctctatata tagtaggcta 44280 

tttaatattc atattttatg atgcaattaa gaaataattt ttttctgaag ttggtagatt 44340 

gttgatatgc catggcccag tgtttctcaa agcattctgg gggatcactg tttgtcagaa 44400 

ttagctgcag tgattgttga acatgcaggg cctctgctcc actccacgtt gctaccagga 44460 

cgctctgcag gtgagagctg ggaagctgta gaagctgcag tgctaacaaa tgctacagga 44520 

attcttgtag tcaccttcat gaggtcttat gttgaggaga ggcagccagt agtgtccctt 44580 

gtccttcccg ttttatggtg taagtttcat tttaagggag gtataaatca aagcccacct 44640 

gggcattctc tcatggttca ctgcttcttg taatcatgga agatgtcatt gcggcagaga 44700 

cgaaacagtg tagtttgatt actattgatt tttttttaat tatttttctg aagtggctgt 44760 

tgtaatgtaa taaattgtgt gcttaaggac aacctttggt attctatttg agtattgtgt 44820 

atgatcctag ttaagttttt tctaccagta ttttcatatt acaacatatt tactttccat 44880 

ttctattaat atttttatat ttaaagtatg gaggccgggc acagtggctc acgcgtgtaa 44940 

tcccagcatt ttgggatgct gaggcgggtg gatcacaagg tcaggagttc tagaccagcg 45000 

tgaccaacac ggtgaaatcc catctctact aaaaatacaa aaattagccg ggcacagtgg 45060 

taggcacctg taattccagc tactcaggag gctgaggtag gagaatcact tgaatccggg 45120 

aggcagcagt tgcagtgagc taagatcgtg ccactggact ctagcctggc tgacagagca 45180 

agaatccgcc takaaaaaaa gggatcaggg aagaggggat tacagataac ccaaagaaga 45240 

aggaaaaatc tcbacaagtt cacctgtcca gcggtaaccc caatttggat attttccttt 45300 

aacaatttgg atWttttcct ttaaatcctc ttttttataa tgtctatatg ttggagagag 45360 

tatgtgcctt tapgtatttt ttaaagatga gatttctgtg tgtgtctata tctcctgttc 45420 

ttcatatttt cttgtgtgtt ataaacagct gtacatgtca gtatatatac ttccgtaact 45480 

tttttttaaa ggctatatag tgttcattga tgtgatttaa cagcagttat ctccccggct 45540 

tcatcttgtt ggaatgtggg tcctgtgtgt tgccttcaga gcaaatgggg cttggttttg 45600 

cagcaagtag acctgtgacc tgtacgaata gttggaagac tttctctatt acccaagcgt 45660 

atcagtatac ttbagtgcct actagaaatt tatgggtaga aaaacaataa tatcttagag 45720 

tattttttcc tagattccct aaggtgctat agggtgattt ttactcatgt aacatgaact 45780 

atgcttcaac taagatagtt tttgcaaatg tggatatata agtactttat taaacctata 45840 

ggaagtattt ataccactta tttcctccct tcagtgttag aacctcctaa atggcatttg 45900 

acattgaact gctttccact ttgtcgcatg ctcctctcat tgtccctacc tgggtcctga 45960 

accttaggga cttggctgtt atagccccac catggctacg ctgggccttg gtcgtctctg 46020 

agacttagtt tcttcatctt acaaggagat aataacagcc cctgcctgcg tagaattgca 46080 
gagatcaaat gaaataatta acatactcaa aagcatgccg taaacacatt ctgagcacat 46140 
gtacgtttta ggaaaaacaa aaggacccat gcacatttcg gagtgctttt gtctcagcag 46200 
cactgcctct tcttccaaag ctgacgtctt agtagaggcc ctgccacgtc ctgagcactg 46260 
tactccacga agcattctat ttctgacatt cgaaatgcag tctgttccat cttccttaca 46320 
atctgtatgc cagcacttga aataccgggt atctgcagtg ttgaccaggt gattacttaa 46380 
ttatggaaat gttgaggtgg agatctagat aattcagtga aggcaggaaa attggtgtcg 46440 
gaatctgtct ttttatgtgt cagaaataga aataagatag ggtgagaagt aatttgtggc 46500 
taaaacacta taatagctaa cacatagtgc atactgtgtg ccaagcactc ctgtaggtgc 46560 
ttgaaatctt ctattattat tatccctact ttatagactt gcacccttag gcacagagag 46620 
gcggacagtt gtccaaggtt accccagagg tggagatcca ggctacctga ctccaccatg 46680 
tgtgctcttc cctagggcac agttgtgctg ctaaaaatac tttttaagca gttctttgat 46740 
tattcagatg atagtactgt aggaaaatta agacaaaaat aatgaaaaat taaaatcttt 46800 
attttagtgt tttgcacatg tattattaaa gccagtttac tcctggaagt gtgtaagaat 46860 
acagggtatt tttgatcacc taaatgctgc atgttactaa gagctcgaca ctgaagtcaa 46920 
gaagagcagt tgpagagagt acttagcaaa aacgggaagt gtgtggggtt gaaggagcaa 46980 
agacaagtct tcctcggacg gtggagtgta ofaattcatca tttctcagaa cacgtctttg 47040 
aacgcatttt caatttgagg ccaaaggtct cagcctccca ctcggcatac ctccctacct 47100 
tagtcagctc ttaaatctta ggaatatttc tttgttcttc aaggaactta aatatgttaa 47160 
cattcttacc tgtccacagg gagcccccta caaagaaggg agtttctagt ctccgttctt 47220 
tcttggaata aataatagcc tcataccttg tgcaatcgag gctgaaaaag actgtctcct 47280 
tttttcaaat aagcaagtct tagaaactac agttgtttac agggctcatg gctattccac 47340 
agtaataatt ttggttcttt taccaattat ataatatgtt aaaatatggc aagtatcagg 47400 
aaagcaagga gtggcaatga ttagaaacca atggccaagt tagagaggag gggcaattgc 47460 
tcccccaagt ttgttgtggc tgtgtagcag tcagtgacga gaagctgtgt gtcaggcgac 47520 
aagcaaagtt gaggattatc aggcgcctgt gagtgcccag ctgtgtgcca ggtcaggagg 47580 
tgccatcgtg agccagacca gcttcctctc ggcccctgtg gagctcgcag tctggtgggg 47640 
aggcagcagt caccatggtg acaggtgaca cactaggatg gggctggtgg tggtaggcat 47700 
ttgcgggtcc cttcagagag gtgagtatgg acttagagga ggctccagct tcctattcct 47760 
gggctgtcta tagcactaaa agttgtcaca tgaaaaataa catttggtac tattgattta 47820 
acttaatgac ttatgtaatt gtagttgact tagaaattat aacatgctct tctacttcag 47880 
cttgaaaccc ccaaccacca gtttataatc cttttttttt aacttttgtt tatttttcct 47940 
aaggaatctg tactttttct tcattttaca actttttttg tcctgttacc ttattttcat 48000 
ttttacttta tatgaccatg agttctaaaa tagtaaaaaa aaagaattat ttttgttctt 48060 
tgttagaatt tctctgcaaa gaatgtccaa aaattcatat tcacattgat cgtatcgaca 48120 
aaaaagatgt cccagaagaa caagaacata tgagaagatg gctgcatgaa cgtttcgaaa 48180 
tcaaagataa gtgagtaaca acagttccag cacttccgga acttcggttc aactagattt 48240 
cagtatagtc aacaatttga aaccaatgta aatggttata ttgtctcaag aatacatttt 48300 
ataaattcaa atcaaatttt atgcatgtct gatcgtgttt taaactttac ttgtacaaat 48360 
cagtctaaaa gaacttgtta cagtgggccc atctacttgc attgatagta tttcttggac 48420 
aatactacgt gataacatag caaattaaat taaaaacaac aacaaacaca caaaaaaact 48480 
ttccagtgtc ag^tgcccgg acctacctgt caggtcacat aaagtggtgt tactgtgtga 48540 
ggtctggctg ttgggccagt gtgcgcagaa aagcaaggga ggggtagagg actatgcgga 48600 
cgtgcaggtg gacatgatgc tgttatattt gttggaaata gaagggggca gttgacagcg 48660 
ttatatccaa agtgtcttct gtggttaatt atattcagaa attttagcca attgttttat 48720 
tctctaaata tgtactttct gctcaagaaa ctatcattgt tcttcttttc cttgttttac 48780 
agtacagtgt ttttaattaa ccctcctggg ttaactttac caggtgaaaa tgattaaaag 48840 
tgtaataggt takcaatgaa actttaagct tctatttttc attgactctt aactgtacat 48900 
gatgtaatgt attcagcgag ccattcagga ccactttggc ccatggaaga aatttaaaag 48960 
taagatctac at^tattgac atgaaaatat gttctcagaa aaaagactaa tgtatttaat 49020
gtcctactta ttttataagt atttagaata cctctggaca ttttaaaaca atgattattg 49080 
ctagggtgtg tgatttataa agcaatagaa gcgctttccc tttctgtttg tgttttagat 49140 
tattatatcg ggtatgttct gctatcataa ctttacaaat cttatgtaat atgggaaaat 49200 
gagttaacta tgctgttttc cttcttttac ctgcctttct aattctgtgg gaataaaggc 49260 
gtttttgaga cagcccaggt gtagtgagca gtccatatcc atggattcca cattcatgga 49320 
ttccaccaag cacagaccaa aaatactcag aaaaaaaggg ggctggctgt ggtggctcat 49380 
gcatgtaatc ccagcacttt gggaggctaa ggcaggcaaa ttgcttgagc ccagaagttc 49440 
aagacagcct gggcaacatg gcaaaaccct gtctctacag aaaatacaaa aattagccag 49500 
gcgtgcacct gtagtcccag ctactcagga ggccgaggtg cgaggatcac ctgagcctgg 49560 
aaggttgaga ctgcagtgag ctatcattgt gccaactcca gcctggtaac agagtgcctt 49620 
ttttcaaaaa aaaaaaaaaa aaaggatttg ggaggatatg catatgttat attcaaatac 49680 
atgccatttt attcatatat cagggacttg agcatccttt gatcttggtc tctgccgggt 49740 
atcctgggac cagccccctg tcgatacaga gggaccgctg tctaagaacc gctggtccta 49800

tctttgactt ctggcggaat aggagctcca tgtaaaaagg aggagaagct gcagcgggtt 49860 

attagccatt tgtgagtcag gtcactgtaa aactttatca aaagtttaaa agacaaaaag 49920 

catcctcata aaatgcctta aaaccacctg ttgaaatatt acatatacaa ttcatgtata 49980 

ctaatcatag agcatattaa agatatttta gaagactaga aacttctatt aaaccaagtt 50040 

tctggatgtt tccgtattca tccttatttt ccagggacct gcataacttt tccagcgtgt 50100 

aatagctacc tgattgatat tttttgaatt gaaatactga agtgactaaa atctaaactt 50160 

tttccattct ggpcatagga tgcttataga attttatgag tcaccagatc cagaaagaag 50220 

aaaaagattt cctgggaaaa gtgttaattc caaattaagt atcaagaaga ctttaccatc 50280 

aatgttgatc ttaagtggtt tgactgcagg catgcttatg accgatgctg gaaggaagct 50340 

gtatgtgaac acctggatat atggaaccct acttggctgc ctgtgggtta ctattaaagc 50400 

atagacaagt agctgtctcc agacagtggg atgtgctaca ttgtctattt ttggcggctg 50460 

cacatgacat cakattgttt cctgaattta ttaaggagtg taaataaagc cttgttgatt 50520 

gaagattgga taatagaatt tgtgacgaaa gctgatatgc aatggtcttg ggcaaacata 50580 

cctggttgta caactttagc atcggggctg ctggaagggt aaaagctaaa tggagtttct 50640 

cctgctctgt ccatttccta tgaactaatg acaacttgag aaggctggga ggattgtgta 50700

ttttgcaagt cagatggctg catttttgag cattaatttg cagcgtattt cactttttct 50760 

gttattttca atttattaca acttgacagc tccaagctct tattactaaa gtatttagta 50820 

tcttgcagct agttaatatt tcatcttttg cttatttcta caagtcagtg aaataaattg 50880 

tatttaggaa gtgtcaggat gttcaaagga aagggtaaaa agtgttcatg gggaaaaagc 50940 

tctgtttagc acatgatttt attgtattgc gttattagct gattttactc attttatatt 51000 

tgcaaaataa atttctaata tttattgaaa ttgcttaatt tgcacaccct gtacacacag 51060 

aaaatggtat aaaatatgag aacgaagttt aaaattgtga ctctgattca ttatagcaga 51120 

actttaaatt tcccagcttt ttgaagattt aagctacgct attagtactt ccctttgtct 51180 

gtgccataag tgcttgaaaa cgttaaggtt ttctgttttg ttttgttttt ttaatatcaa 51240 

aagagtcggt gtgaaccttg gttggacccc aagttcacaa gatttttaag gtgatgagag 51300 

cctgcagaca ttctgcctag atttactagc gtgtgccttt tgcctgcttc tctttgattt 51360 

cacagaatat tcattcagaa gtcgcgtttc tgtagtgtgg tggattccca ctgggctctg 51420 

gtccttccct tggatcccgt cagtggtgct gctcagcggc ttgcacgtag acttgctagg 51480 

aagaaatgca gagccagcct gtgctgccca ctttcagagt tgaactcttt aagcccttgt 51540 

gagtgggctt caccagctac tgcagaggca ttttgcattt gtctgtgtca agaagttcac 51600 

cttctcaagc cagtgaaata cagacttaat tcgtcatgac tgaacgaatt tgtttatttc 51660 

ccattaggtt tagtggagct acacattaat atgtatcgcc ttagagcaag agctgtgttc 51720 

caggaaccag atcacgattt ttagccatgg aacaatatat cccatgggag aagacctttc 51780 

agtgtgaact gttctatttt tgtgttataa tttaaacttc gatttcctca tagtccttta 51840 

agttgacatt tctgcttact gctactggat ttttgctgca gaaatatatc agtggcccac 51900 

attaaacata ccagttggat catgataagc aaaatgaaag aaataatgat taagggaaaa 51960 

ttaagtgact gtgttacact gcttctccca tgccagagaa taaactcttt caagcatcat 52020 

ctttgaagag tcgtgtggtg tgaattggtt tgtgtacatt agaatgtatg cacacatcca 52080 

tggacactca ggatatagtt ggcctaataa tcggggcatg ggtaaaactt atgaaaattt 52140 

cctcatgctg aajttgtaatt ttctcttacc tgtaaagtaa aatttagatc aattccatgt 52200 

ctttgttaag tapagggatt taatatattt tgaatataat gggtatgttc taaatttgaa 52260 

ctttgagagg caatactgtt ggaattatgt ggattctaac tcattttaac aaggtagcct 52320 

gacctgcata agatcacttg aatgttaggt ttcatagaac tatactaatc ttctcacaaa 52380 

aggtctataa aatacagtcg ttgaaaaaaa ttttgtatca aaatgtttgg aaaattagaa 52440 

gcttctcctt aacctgtatt gatactgact tgaattattt tctaaaatta agagccgtat 52500 

acctacctgt aagtcttttc acatatcatt taaacttttg tttgtattat tactgattta 52560 

cagcttagtt attaattttt ctttataaga atgccgtcga tgtgcatgct tttatgtttt 52620 

tcagaaaagg gtgtgtttgg atgaaagtaa aaaaaaaaat aaaatctttc actgtctcta 52680 

atggctgtgc tgtttaacat tttttgaccc taaaattcac caacagtctc ccagtacata 52740 

aaataggctt aatgactggc cctgcattct tcacaatatt tttccctaag ctttgagcaa 52800 

agttttaaaa aaatacacta aaataatcaa aactgttaag cagtatatta gtttggttat 52860 

ataaattcat ctgcaattta taagatgcat ggccgatgtt aatttgcttg gcaattctgt 52920 

aatcattaag tgatctcagt gaaacatgtc aaatgcctta aattaactaa gttggtgaat 52980 

aaaagtgccg atctggctaa ctcttacacc atacatactg atagtttttc atatgtttca 53040

tttccatgtg atttttaaaa tttagagtgg caacaatttt gcttaatatg ggttacataa 53100 

gctttatttt ttcctttgtt cataattata ttctttgaat aggtctgtgt caatcaagtg 53160 

atctaactag actgatcata gatagaagga aataaggcca agttcaagac cagcctgggc 53220 

aacatatcga gakcctgtct acaaaaaaat taaaaaaaat tagccaggca tggtggcgta 53280 

cactgagtag ttjtgtcccag ctactcggga gggtgaggtg ggaggatcgc ttcagcccag 53340 

gaggttgaga ttgcagtgag ccatggacat accactgcac tacagcctag gtaacagcac 53400 
gagaccccaa ctcttagaaa
gacacacagt aactcccaga 
ccaaagcagt attttgtgtg 
ttgtttttag ta^tgtttag 
tatctggtgg aaagctacaa 
aagatttttt ctbcctcctt 
taaagaactt gaaattacgt 
catgcaatca ttpgaatcaa 
ttgtcaaata ttcatgtaat 
ctgtacaatc gctaatttac 
tatttgacat cttttccaat 
agtcatatat gaggtcaaag 
gctggttatc ctgagcaggg 
tgaagtactt ctaatatact 
tttattgatg ttgatgaaac 
ttatatgtga atatgtaaga 
aaaaggtgct attgaaattc 
gtctctttat ggtattttca 
cattaattca gcaataaagg 
ttgtatatgt gataaagttt 
taaatactac gtcatatttt 
aagcctcttg cgggcatcat 
cttgaccact aaaagtaatg 
aacagttacc ttatcaaggt 
tgcataactt taaataaatt 
tttgggaggc cgaggcaagt 
cgtggtgaaa ccctgtctct 
acctgtaatc ccagctactt 
gaggttgcag tgagccaaga 
tccgtattaa aaaaaaaaaa 
tacctgtaaa gaggggagaa 
tagtttgaga ttbtaggtat 
attattattc ttgctaaacc 
attgtccatt atbttctgaa 
tgctaaacat ctttctggtt 
agttactttc tgtctgggtc 
ttcatttgcc catgaccacc 
ctttcagttt attgcagaga 
atccccatgt tttatcattg 
ttgtttatat aatggaaaca 
tatttagatt cttttttttt 
agtacagtgg ctcaatctcg 
ctgcctcagc ctcccaagta 
tttgtatttt tagtagagac 
acctcagatg atccacccgc 
cgcgcctggc ccaatgtatt 
agagaactag aactaaagaa 
ggagatgtgt cctggaatga 
tattgtttcc ttgccttggt 
gtgatcgttg gaatttggtc 
ttaaagattt ctgaggttgg 
agagctgtgc actcacttca 
<210> 180 
<211> 2000 
<212> DNA 
<213> Homo sapiens 
<400> 180 

gtggatctgt gactgttcgc 
gtcaggagct gggtttggag 
tggcaaacat tgcacaaaag 
aacaaattta caicataaaca 



atgaaaagga 

tatgtaccac 

tataattgca 

attgaagatt 

tgcaatgtcg 

ttgggccagt 

tatcacttag 

aattagtact 

taactgaatt 

tcagtttaga 

ttgtgtatga 

acatatacct 

gaaaaggtta 

gagggaagta 

agatcagttt 

tatgttct^c 

tgtgtctcca 

gaataaagtc 

aaaatatgca 

acatgttgtg 

aaaggttcag 

ctcatctcac 

catctgcaag 

aaatcccaga 

tcctgccggg 

ggatcacttg 

actaaaaata 

gggaggatga 

tggcgtcatt 

aaaaaaaaaa 

tatgtattta 

aaaaatacat 

aattcagttt 

gtgttttgga 

attcaagctc 

acaggtcagt 

tttattcttt 

agatggtgga 

gctagagtgg 

aagatgagga 

ttttttttaa 

gctcactgca 

tttgggatta 

ggggtttcgc 

cttggcctcc 

tggattctta 

tttctgtgtc 

atgaatacat 

tgatttggtt 

atctactaga 

gattaaggta 

caaatttgaa 



aatatagaaa 

aaaaaatgtg 

agcgcatagt 

gagtgaaata 

ttgtagtttt 

tttcattacg 

tataattgac 

ttggtcaaaa 

taaaaccttc 

gtagctacaa 

aaagtaaatc 

tgttattata 

tttttaggaa 

taatatgtgg 

ttccatccgg 

aattttataa 

gcaggcaaga 

tgacttgtgt 

tctcaaaaat 

tatatatgtt 

tttgtagtga 

tgtcatcaca 

catactgcca 

ctctaaaaga 

cgcggtggct 

aggtcaggag 

caaaaattag 

ggcaggagaa 

gcactccagc 

aaaaattcct 

cttcaaagag 

tcttatataa 

tatttgctgt 

actcaacaca 

gtgtatactg 

tcttgatagt 

ttcctcaact 

gaaaagccgg 

aaaatagcag 

aagaacctgg 

gacggagtgt 

acctccattt 

caggcgtgtt 

catgttggcc 

caaagtgctg 

aagaacactt 

aaactgttta 

cagtaaaata 

ttactgtgaa 

aaatgagaaa 

gtgttcccaa 

ttcctgctct 



tataaaattt 

aaaagagaga 

aaaataattt 

ttttcttggc 

gcatggcttg 

agtaactcac 

attatataga 

tatttacaac 

aactattatg 

ctcttcgata 

tattcctgta 

atatgtatac 

aaccacttca 

aacaaactct 

attattattg 

atgttcatgt

atacttgact 

ttttgagatt 

tggtgataaa 

gtattgccaa 

tagtaaacaa 

aaccccatgc 

ggttttggat 

gttggtgctg 

cacgcctgta 

tttgagacca 

ccaggcgtgt 

tcatttgaat 

ctgggcgaca 

ctcctgtttg 

ttcagggaaa 

ttttaacacc 

ctaaaatgtg 

tgattgtgag 

tgctctgttg 

tttcggacaa 

gcacccatct 

aattcccacc 

taactactgt 

cttagatcag 

tgctctgttg 

ccctggttca 

ccaccacacc 

aggctggtct 

ggattacagg 

tcaaattaaa 

gcaaatgtaa 

ccatacgtat 

ataattttca 

gaagttaata 

ggtgttctaa 

gtgttaggcg 



gcttattata 

gaaatgtcta 

taaccttaat 

agatattccg 

ctttataaac 

actttttgat 

gactatgtaa 

attcacatac 

aagtgctcgt 

ctatcatcaa 

gcaactgggg 

tataataata 

aatagaaagc 

caacaaaatg 

gttcatgatt 

ctttttttaa 

aactcttttt 

attggtgcct 

aagttatttc 

atacggctat 

gcagtgcact 

cacagcgtag 

agtttgtacc 

tgtcactaca 

atcccagcag 

gcctggccaa 

ggtggcaggc 

cctgcaggcg 

agagcgagac 

agctttccct 

tgactctcac 

aatgtgagag 

tgaataagta 

gaggatttgt 

agacatgcag 

ttaaccagtt 

tttataaggt 

caccgctgcc 

gagagatcat 

agaactgatg 

cccagactgg 

agcaattatc 

tggctaattt 

cgaaatcctg 

cgcgagccac 

tatcagttga 

gtagaagctg 

gttatgatgt 

atatagaatt 

gctatcttcc 

aacggcagcg 

ctgtgctagg 



53460 

53520 

53580 

53640 

53700 

53760 

53820 

53880 

53940 

54000 

54060 

54120 

54180 

54240 

54300 

54360 

54420 

54480 

54540 

54600 

54660 

54720 

54780 

54840 

54900 

54960 

55020 

55080 

55140 

55200 

55260 

55320 

55380 

55440 

55500 

55560 

55620 

55680 

55740 

55800 

55860 

55920 

55980 

56040 

56100 

56160 

56220 

56280 

56340 

56400 

56460 

56520 



aggaagagag gagcgggagc 
ataaagaggg aacaagagaa 
tttacaactt cgtgactaac 
catatttact gactttatac 



aggacagaca ataactgata 60 

agttaagttc tgtgttttca 120 

agtaatctgg ggtgattcac 180 

acagcaatcc taacgtgaac 240 
acagaacctg ctttatcttt tcgcacactg ttctagtgta gagatgtctg gtctcagtta 300

aagaaagcat aaggagcatt agttgtgcac actgtccaca cccgtgactt ttttccacca 360 

gtactaaacc tagtgcttct tacagtacag ggcaatgaca gccacagaaa gagagaagct 420 

ccttttactg tgtaatgctt cctgctggcc ttcaaatact tgttacttga gagatctcca 480 

ttcacctggc tttgtcccca aaggtcatca tctaccaatg atgttgttat ttgatgttaa 540 

tcatgtataa agaaagtagc taccatcctg gccctgatta gaacttccca ctgaaatacc 600 

gtcctgccta aaggtagcac aggtttccat tatggtggtg gtggggaggg ggcgggaata 660 

tatatatata tatatatata tatatatatg gtaaagcatt cggcattctt ttaaagtaca 720 

actatccttg aaaagggtta catattaaac catttttacc acagccaaag gggaggagaa 780 

agatccaaaa gtcctgtgga tctgctttaa catcaataaa acagttatcc acccttcgta 840 

gcttttagtg aaggctacaa aagtatgctt tttatggatt acacatgtgc acgcaactac 900 

tttaattact acagaaaaaa acgaggctcc ttattaaaaa aaaatcagaa acaagtccaa 960 

cagactctga ggaaatgaag caagagtgaa ttctgaaaag gtctaataaa cagtatggaa 1020

atatccttgt gggattgttc ttcagctatg cataaacatg taattatcat cattactgtg 1080 

atggggaaaa acacggaccc taattctgaa acaccctggt agcgagagac gggcaggagg 1140 

ggctgctgcg cactcagagc ggaggctgag gaggcggcgt ccccttgcaa aggactggca 1200 

gtgagcagat ggggacactc gagctgcccc gcgacctggg ccgagctgcc tacaacctgg 1260 

gcccaggtgc ctgcaagaat tagacctccg ataacgttaa cacccacttt ctcactgctc 1320 

taattgtgtg catcccggcg cccaggggct tgtgagcagc aggtgcgcgt tccaggcagc 1380

tccagcgacc cttaaacctg accgcgcgca cgtccggccc gagggagcag aacaagaggc 1440 

acccggaccc tcctccggcc agcacccacc ttcacccagt tccgtcagtc gccaccacct 1500 

cccttcccgc gtccgcagcc ggcccagctg gggagcatgc gcagtggccg gagccgggtt 1560 

gcccgcgcca cagcaggtag ctgtactgca actgtcggcc caaaccaacc aatcaagaga 1620 

cgtgttattg cc^ccgaggt ggaactatgg caacgggcga ccaatcagaa ggcgcgttgt 1680 

tgccgcggag ccccctgccc cggcaggggg atgtggcgat gggtgagggt catggggtgt 1740 

gagcatccct gagccatcga tccgggaggg ccgcgggttc ccttgctttg ccgccgggag 1800 

cggcgcacgc agccccgcac tcgcctaccc ggccccgggc ggcggcgcgg cccatgcggc 1860 

tgggggcgga ggptgggagc gggtggcggg cgcggcggcc cgggcccggg cggtgattgg 1920 

ccgcctgctg gcpgcgactg aggcccggga ggcgggcggg gagcgcaggc ggagctcgct 1980 

gccgccgagc tgagaagatg 2000 
<210> 181 
<211> 1901 
<212> DNA 
<213> Homo sapiens 
<400> 181 

taaaggttca gtttgtagtg atagtaaaca agcagtgcac taagcctctt gcgggcatca 60 

tctcatctca ctgtcatcac aaaccccatg ccacagcgta gcttgaccac taaaagtaat 120 

gcatctgcaa gcatactgcc aggttttgga tagtttgtac caacagttac cttatcaagg 180 

taaatcccag actctaaaag agttggtgct gtgtcactac atgcataact ttaaataaat 240 

ttcctgccgg gcgcggtggc tcacgcctgt aatcccagca gtttgggagg ccgaggcaag 300 

tggatcactt gaggtcagga gtttgagacc agcctggcca acgtggtgaa accctgtctc 360 

tactaaaaat acaaaaatta gccaggcgtg tggtggcagg cacctgtaat cccagctact 420 

tgggaggatg aggcaggaga atcatttgaa tcctgcaggc ggaggttgca gtgagccaag 480 

atggcgtcat tgcactccag cctgggcgac aagagcgaga ctccgtatta aaaaaaaaaa 540 

aaaaaaaaaa aaaaaattcc tctcctgttt gagctttccc ttacctgtaa agaggggaga 600 

atatgtattt acttcaaaga gttcagggaa atgactctca ctagtttgag attctaggta 660 

taaaaataca ttcttatata attttaacac caatgtgaga gattattatt cttgctaaac 720 

caattcagtt ttatttgctg tctaaaatgt gtgaataagt aattgtccat tattttctga 780 

agtgttttgg aactcaacac atgattgtga ggaggatttg ttgctaaaca tctttctggt 840 

tattcaagct cgjtgtatact gtgctctgtt gagacatgca gagttacttt ctgtctgggt 900 

cacaggtcag ttcttgatag ttttcggaca attaaccagt tttcatttgc ccatgaccac 960 

ctttattctt tttcctcaac tgcacccatc ttttataagg tctttcagtt tattgcagag 1020 

aagatggtgg agaaaagccg gaattcccac ccaccgctgc catccccatg ttttatcatt 1080 

ggctagagtg gaaaatagca gtaactactg tgagagatca tttgtttata taatggaaac 1140 

aaagatgagg aaagaacctg gcttagatca gagaactgat gtatttagat tctttttttt 1200 

ttttttttta agacggagtg ttgctctgtt gcccagactg gagtacagtg gctcaatctc 1260 

ggctcactgc aacctccatt tccctggttc aagcaattat cctgcctcag cctcccaagt 1320 

atttgggatt acaggcgtgt tccaccacac ctggctaatt ttttgtattt ttagtagaga 1380 

cggggtttcg ccatgttggc caggctggtc tcgaaatcct gacctcagat gatccacccg 1440 

ccttggcctc ccaaagtgct gggattacag gcgcgagcca ccgcgcctgg cccaatgtat 1500 

ttggattctt aaagaacact ttcaaattaa atatcagttg aagagaacta gaactaaaga 1560 
atttctgtgt caaactgttt agcaaatgta agtagaagct gggagatgtg tcctggaatg 1620

aatgaataca tcagtaaaat accatacgta tgttatgatg ttattgtttc cttgccttgg 1680 

ttgatttggt tttactgtga aataattttc aatatagaat tgtgatcgtt ggaatttggt 1740 

catctactag aaaatgagaa agaagttaat agctatcttc cttaaagatt tctgaggttg 1800 

ggattaaggt agtgttccca aggtgttcta aaacggcagc gagagctgtg cactcacttc 1860 

acaaatttga attcctgctc tgtgttaggc gctgtgctag g 1901 
<210> 182 
<211> 4550 
<212> DNA 

<213> Mus musculus 
<220> 

<221> exon 
<222> 2259..2488
<223> exonl 
<400> 182 

tacagctatc tgtgtgtatg ggtgc^tgcc atggaccacc tgtaga?ggtc acaggacaat 60 

ctccaattca ctttctcctt ccaccacatg ggctctgtaa tcactaaggt ttgtcaatcc 120 

tgagtacaga tgttcagaac catcttactg tctcctctct tctgataaaa catgaggtgg 180 

ccccagagac gttttagaca gggttataat ctgataaggg aaaagccaca tgtcctttcc 240 

ttacaaatgt aatttctaca gacattccta gaaaattgaa actttatggt tgggaaagga 300

gagggggccc tcaggtacct tgtttttctg ttgacaaaag ttgactctta acattgtcaa 360 

gtaaatgctc ccacaaatgg atcatctgac tatttgcaga atgtcatagg ccaacagaga 420

gagaacccct gaatttccag agaccttcag gttggctcag tcccttcttt tttgatgtgt 480 

acctcaattc ctgtcttcct gaactcttgt ttgccaatct gaatctacag tctatctgtc 540 

aaacaattcc tttgtctgga ctggtctgct gaactgacag tgaattgtct tgacagttcc 600 

tttgcctgcc cttttacctc tgcatcttca ttaaactgga cagtttgtca tatctgtgac 660 

ccaccaacag ctgcttttcc cctaaagctg ggtttgtggt tcatgttatc gtgacagaca 720 

ctcttatagc cctgtcagtt ctccagcact ggcttcccaa ggcttttaaa actcctttct 780 

tctttctaac tctttgtagt cactgtaacc tatatatgca tatgtaaaca gagatatact 840 

tacagagtga tgtatgtgtg atctgagagt taatattagt aattaagact gcaataaaag 900 

aacctgtgtt tcccttagca agggctacag agtaaagtgg gcctctctgg tgccagcgaa 960 

gccactgtac ttagtgaaat ttattgtcat tcaatacatt ctgatatcgt gtaaactcct 1020 

aagcacgtcc atctgacata gtgtgctaat gacaggagtc acctgtatgc cttatgaagc 1080 

gcatctcaga ggtgatggga aagaaacatg gggcaaaaga tgaagggaaa tccaaggcaa 1140 

ggaagcagag acacaggcgt cagtggtgtg gaaagggaga aaactagggg cagaataagt 1200 

gaccttaggg tcacttagag aaaccaacac acacacacac acccacatat ttaaaacgta 1260 

ctttatacag atctgagcgt gcgcactgac ctgtttcctt ctataccttc ttgtatagaa 1320 

ttatctggtc tccactagtt agggcagtga aaggacctgg gcccctggat aagtttttgc 1380 

tgttacttaa ctattctagt tttctggagg gaagagaact tatggatcct acatgtatag 1440 

ggaaatactt tcctacacat tgaaaagaag aaatgtagga tattaggaaa acgcacagta 1500 

gaaacaagtt aaagagcaag aggttattaa agggcaaaag ttaaggcttt gaaagattta 1560 

atacaaggag gtgacagtcc cgtgaaaggt gaaccaaggg tacaggagac ggacccagcc 1620 

tcattctgca acjagccaaga ggagggaagg tgtgcttcct atgcacgtgg gggcacgggt 1680 

ggccctccgg cacgcgaaga cgctgcagtt gtccataacc tgcggcatcg agctcctcct 1740 

gtgctccacg acttagtcgg ctcacgcgtg tcttgcagga agcatcctcg tgtctccacg 1800 

cagctctcgc acgccagcac aggccaaaac ccaccacctc acttcttccc gggctcatcc 1860 

ccagccagca ttcgcagtcg agcatgcgtc gtgacgaggc caagggaccg agccaatcag 1920 

aacacgtatt acgcccataa gtcggccaat caggaggcgc cttattaccc gggagccttg 1980 

cttcaccccg cctccccgct gacaagcacg ggtcgcgcgg agcaaagcga gcaccccgag 2040 

gcgagtgcgc ccggcaagcc gaggcgtgcc ctttccaagg cggcgagcag aggccgtcac 2100 

tgtccccgcc gggtcccggg cccccgcggc ccatgctggg ggcggagcca gggcggaggg 2160 

cggcggcgcg gccggccccg cgcagtgatt ggcgggcggc cggcggtggc tgaggtcctg 2220 

gtggccgcgc gggcaacgca ggcggagtcg cggctggcga gccgagagga tgctgctgtc 2280 

cctggtgctc cacacgtact ctatgcgcta cctgctcccc agcgtcctgt tgctgggctc 2340 

ggcgcccacc tacctgctgg cctggacgct gtggcgggtg ctctccgcgc tgatgcccgc 2400 

ccgcctgtac cagcgcgtgg acgaccggct ttactgcgtc taccagaaca tggtgctctt 2460 

cttcttcgag aactacaccg gggtccaggt gaggcgcggc cgcgcagggc tgcgtgcgag 2520 

ccctccccgc ggccggggcg gcgcttgcaa cccgggcgaa cactcgcagc ccggcgagca 2580 

cgtgccgcag ctcacggcct cccgccgcgg ggggaagttt ctggttctca cttcggggtt 2640 

ccttctggaa cgtcctgctg aggctgagtg tgttcccggg tccgccccac ccccgccccg 2700 

ggccggctgt tactgcccat ctcagtgcct gccaaagtag ggcactgagt ccgaggtggt 2760
gatgctggga ctggcttcat ttgcacttcc gaggtctttt agattagcaa gacctctagg 2820
cgctgaccaa agtgacagct gtgaaggacg actcctgcct tgggttcctc ccgggtgaaa 2880 
gcgagggcct agggaggaaa tgaatacatt ggttacaata ggagcctcac tgtcgataca 2940 
gttctcttca gcttggactg ggcttcaatg tgggctgatc tcttgtcaga ttgctttctt 3000 
cctgctactg tttctttctt tctttccanc cctccctccc cccccccccc cgccccgtgg 3060 
agattgaact ctgaaaacaa taaagagtag aaagctctcc taatgtgaat tcgttatatg 3120 
acatcccata aaaacctaca gttgtacttc ctttttggtt ttcagtttca aagaagagct 3180 
ctgtttgggt tctcccagat gtatctatga ctttcccccc catttctcag ttcttttcat 3240 
tctgtgttag gggggtactt tggcgactgg atcccttact gagttttgcg ccagttggag 3300 
attatgtctg aggtagggaa ttaagacctc tctgaatcac tatcttttta aatgttttcc 3360 
tagggaatag gaaaatcact gttgcacatc aaggtttctg aaaaattgac ttttagaata 3420 
ggatttcatt cahaattttt aggaaccccc acactgatgg tttcaaacct ccctcttact 3480 
ttactaagtt tgtcaagtga atgtatggtc taatcgtgga taagtattta atttcactag 3540 
cagaagggac aagacagcgg ggagcacaac ttaaagttgc tgaccttgca catgacaagt 3600
acccctcaga cgbtcaggga cctctactca agtgccacct atattcttgc tgcagagacg 3660
ttaggatgag tckgaatgaa gcaaagttag tgagtttatt gattgggaga gaggsfcacgc 3720 
acttgagggg agtcaagtgc aaaccttatt accccccacc caggctacag cagctgtttt 3780 
ctaagtgatt ttagggcttt taagttaacg ccttaaaact aagattaagg agaagagaag 3840 
gaaaaaaatg agttcttcta ttctttccaa taatgagctc taaaaaaaaa agaagcaaac 3900 
caggatctca cactgtagtc ttggtgggca ggaactctat gtagacctca caggcctcaa 3960 
gttcacagag atctgcctgc ctctgtctcc agagtgttag gactaaaggc atgtaccgcc 4020 
atgtctggat taaactcttt tagttatatg aaatttaaaa cggattcatg gcggtactga 4080 
acagtttaca tatgagggag aaatgtggtt aggcagtaat atggatcaaa ataaaatcaa 4140 
agtaattagc tgatcactgg tcacaagagt ttgagatgtg agcttgtctt ctgccttagg 4200 
tcaccagcta tagggataat cttttgtttg ttttttgtgg tttttgtttg tttgtttttt 4260 
tgtttttttg agacagggtt tctctgtgta gtcctggctg tcctggaact cactctgtag 4320 
gccaggctgg cctcgaactc agaactccac ctgcctctgc ctcccaagtg ctgggatgaa 4380 
aggcgtgcgc caccacttgc ctataatctt acttgtaatg gttttagaat atgtgcacag 4440 
tggagagcag tgttcaagca gctgtatcca accaattnca cttaaagagg gagagggtga 4500 
gggtgagggc ctccttttgc tattcaaaag cagattgtgt ggacattgca 4550 
<210> 183 
<211> 37950 
<212> DNA 
<213> Mus musculus 
<220> 

<221> exon
<222> 5259..5328
<223> exon2
<221> exon
<222> 12675..12791
<223> exon3
<221> exon
<222> 14621..14710
<223> exon4
<221> exon
<222> 19822..19912
<223> exon5
<221> exon
<222> 21789..21950
<223> exon6
<221> exon
<222> 23387..23510
<223> exon7
<221> exon
<222> 25520..26016
<223> exon8

<400> 183 

tggagtgaga ggcctgggta taattccttt ttctttgtca cactgtagca gttctgcttc 60 
tcagcctcag ttgagactgg aatacatttg tcatgctgtt ctgaagactt taatggttga 120 
tctttactgt caccttgact ggatttagaa tcgccttgga gatgtgctta tggtatgtct 180 
gggaggatgt ttbcagaaag tgtttactga aggtgtcctc atctgatagg gtggggccct 240 
ggactcaata aaaaccggca tttatctgtt tcctgtccag ggacacagtg tggccagcta 300

cttcacactc ttgctgcctt cctacatcga aagactgtat cttttcttaa aaggcgaggc 360 

aaaataaatc cccaccccca ggattttttt tggctgtcct ggaactcagg tccacctgca 420 

tctgcctcct ga^tgctggg tctacaggag tacccaccag gcctggctaa aagaaaccct 480 

tcttaaattg ctcctgtcag acactttgac acaacaataa gaaaaataat tcatacaaag 540 

accaagtgag atkatatgtt atttttttag ttcatagagt agtaggtaat ccatggtgag 600 
ggaaaaaaaa aaaaanaacc cttttgaaat aaggacagta ttgagaaaca tgtatgctga 660 

ttgtatgagt tttgtgatat gtaagtttct actcaattca catggtaatt gtgcattctg 720 

atcattaatc aaataattgt gtttactact ttaaccttct tacaaagtat agttttacta 780 
attagtaatt ta&taaattt attatttatt taatgaaaca tttcaaattg gactcagaaa 840 
aagaccacag tttttgtaca tattatagta gaagtccctt agtaggtgga aatcctgtgt 900 
ttctttacaa ggatatgtct agaacacgtt aaacaaacag gaggaggtgt ggctgcaccc 960 

gttaggctaa ccbgtcaaca tgccttttaa agccatacgt gttgtgtgtg agcatttttt 1020 

taaatatata gaaaatcccc aaaatagcta gtataataag cacacatgcc agtaagcctc 1080 

ttattactgt aaaatactgt gtaatacttt gtgcgttctt ttatgtgact gcagtaggtc 1140 

tgtttacatc agcatttacc catacaaggt ggctactgtc attcaggcct tggaaagttt 1200 

tcagcttctt tgaaacttgt ggggttactt ctccagttgt atgtgctgtc catggttggc 1260 

tagacatggc acatgactgg cgcatggacc taaagacaga ttaaagaatt catttcaaat 1320 

gatttcaagc acagagttta aattgtacat gcactcttag tcaatgtcct tgtagctgca 1380

ttttggtgta attggaggga catggactag ttagctgttt cttttacttg atgacagttt 1440 

tgaatgaaaa gcatgttaga tacaaaataa tttaactgtt gacccccccc cacacacaca 1500 

cacacactct ttttccctgt atgccttcac tcctcgaatt ggtttttatc anatagctgt 1560

tccaggtgta ccaggcatat atgaagtagt ttatgtagtt tcatttatgc cagtgcaatc 1620 

ctgtgaggag caattagtac attatacttt atagttaatg agatagatat agagaaagct 1680 

gatacatcac atctatttct tgtgtaagat ggagggctgg gattcaaatg taagtcttag 1740 

tccgtgtctt atactttgca gcctgtggtc ttagtgatga tcatgttatg aagggcaacc 1800 

tctttttcag actgtgtctg caagacccgt gaattaaatg accaaaggca tactaactgt 1860 

agagaaactg cccattttat tgatgctaat atttttacat ggtaggagga aacttggaag 1920 

aaatgagaag ccctactcag ggggattttt caggtgagat tatgtagtga ctagtgtaaa 1980 

agaagctact ataagggata ccaagtatgg gaaataagtg ctgcacacct cagggtggtc 2040 

acacagactc agkctgacag ctcaggtctt cctgctgaag aaggggacgt ctttgaaagt 2100 

gagaaattca tgtcttcttt atagaaagtt ctactccagc tctaggccga agactggggg 2160 

cagagagctt ttpttgtgcc tgcgatttct taactgtttt tattcaaaat cattcttatg 2220 

ccaaaagggc atatttgggg ttgagttctt tcagcgatat catttatatt tgagaccaat 2280 

gagacgtttt tctcactatg tattgtattt caactttcat gctaaaccta ctggacttta 2340 

ctgagataaa tckagaacat acctttaaac tttgtagttc ttctttgcca ctgtgaccaa 2400 

aacacatgac tgggagaaat ggtttatttt gcctcccagt cttggtgatt ttagtccctc 2460 

acagcagaga agbctggtag agcagctcac tgagtggcaa caggactgtg ctcatgtgat 2520 

catggaccag gaagtgtaga acaaagccag aactaggggc cctagtaacc tacttctgcc 2580 

agctaggccc cacttcctga aggttccacc tcctccctcc ccctcccaaa aaagaaaata 2640 

aaaatagtac caccaactgg acagcaaaca ccccaaacat aagccagtga ggagggaagg 2700 

agggaggggg gaggaaggag ggagagaggg gggaaggaga aggagagaga gaaggagaga 2760 

gagagggagg gagggagaga gagagagaga gagagagaga acacaccanc tctcaagtat 2820 

nnnncntnna nnncntggaa caaattaaaa actatgtaac cagaaaatta ttttaagagt 2880 

atttgatttg tctgtattta attaattaac tgaaaaaaaa aggaaaatta attcctttct 2940 

ctaaagactt taaatgaacc atttttttag tgtatgtgtg tgtgtgtcta ggtcaaaaga 3000 

caggtctcag gagttgatat ttgaccatgt gaaccctagg gatcgaactt ttagctcatc 3060 

agctttggca gcaggaatct ttatacactg catcatctca ctggccttag ataagttttg 3120 

aaaaaaagtg gaagaatcta aagttacttg gataattata taaaatataa gtctgaggtt 3180 

gggctcaaga cggaaagaca tcagtaagga gccaagagaa cagccccaga gagaatctga 3240 

gaatcaaggg tgtgggcaaa catacagtga tctacacttt ttatctgtaa atagtttgta 3300

aattcttaac cctttaaaaa aattccctaa cccctactct gcagatcaga ctgccctgga 3360

actcattgta gaagcctaag gtggcctcgc attcacacaa tccttctgac acagctcccc 3420 
aagtgctagg attacaagga taagctactg tgtctgcctt cttagctttt taaattttaa 3480 
aagcactaca acpttgtaac tagttttata cttgtcaata aaataacagt aactgggact 3540 
ggggagatgg ctpaggctgc ttttccagag ggacccaggt tcgattccca ggagccacac 3600 
aatggcttac aajtcatctgt agctatagtt cccagggatc tgatgccctt ttctgattcc 3660 
tgcaggcccc agigcaagcat ggggtgtaca aacacacatg caggcaaaac agatgtcttt 3720 
ctgcattgct cttcaccttt tatattgagg caaggtctct cacttgaacc cagatctcac 3780 
tgattggcta gtgtaactaa ctggcttgct caaggaatcc ctgtctacac ttcactaaag 3840 
ctcttgctgt gtiagcccagg ctagcctcaa attcctaatc ctcctgcttt agccacttaa 3900 
gtgtccggga ttkcaggcat gcaccactac atctggttga aaggtatctt tggatggttt 3960

gggtattatt agcttggagg ctagcgatgt ctccaaagaa ccttacataa tttacatatg 4020 

catacataca tabatacata aacatacata catacacaca tgcatacatt taactggatc 4080 

tgtagttctt gtaaagtgta cccagtggtc cttaaatagt tctactttat ttctaagagt 4140 

gatcatgatc atgagctggc atcccaatat taactctgca cagatcaact aaccattggt 4200 

tacttatttg tttgatttat gtcgtttgta aacttgatta gaagtaatta gacacgtgtg 4260 

aacacactgt gcagttgtct tagagcagta gcagcagcct tagtgtgagt ccttgacatc 4320 

taactttata ttatgtagct acattgcttt tacacagggc ttattaatgc cctttcaggt 4380 

tttgttatca ttatggtatt tgtgagtatt tcaccaccag gaaggcaaac ctttctggca 4440 

tagtagactc caaagcgttc actaacttat gcataattct tagtgacaaa gtataagaac 4500 

aactgctaag taaaagtgat ccaaaggaag ggaacagatg agcaacagca gtcatttgct 4560 

cccttcaagt ccctttctaa tcactggata cttaacactt acaaaacaga acagtacagt 4620 

gaaaacctgg cctcgtccct tggtttaaaa acagagtgta gccaggaggg tacagtggca 4680 

catgatgtag tgaccctgac ttgatgttcc agctcacagg gacctctggc ctccctgttg 4740 

tactcccctg gctccttagc ctgcattttt gttcagcaca caagtggact aacacaagac 4800 

tccagcacac agctttcatt tggtcttatg ct'dagaaacc ctttcctctc atttgctaaa 4860 

atatatccaa cggacacagt agctctccat tttacaaatc cttaaaatat agagtatgtt 4920 

cataatattt gcgttatttc ccattatgtt attatgtata tgtccatctc ttccgctgga 4980 

ctgtagagcc ttatatatta ttagtgatca ataacttctt gtggattttt cttacaggga 5040 

tatcaatcat tctgtcttac tttatgatca ttaacctgtc ctacatttaa gtactttata 5100 

agtatgagct atttatagtg tgtggggctc ccaagagaac tctgatgttt gttagtcatg 5160 

gatagagcta gttacattgt gtccttgtct gtttcctttc acattctttt ttttttttta 5220 

attgtcaaag tcbtgattct ttttgttttc tcttttagat attgctatat ggagatttgc 5280 

caaaaaataa agaaaatgta atatatctag cgaatcatca aagcacaggt ttgtatttca 5340 

tttgatgaaa tttgggtttt tctagaaatg gtaaatgagc attaatatgt acacacacat 5400 

acacacaaac acacatatgt acacacacat atgttttaaa gacaggattt catgtgaccc 5460 

agaatggcct catactctct gagtagctga gaatgatttt aagcttgtga cacacctgcc 5520 

ttcatctcca aggtacagga attgcaggtg ctttctttgt atgcttagct tatgtggtgc 5580 

tgggcatcaa acccaaggct tcatgcaaac taggcaagca ctgtcccatc tgagctacat 5640 

ccccagccca tcaaaatgat attttaaggt tatttattta atgagtttta tcacgtgtgt 5700 

gtgtatctgt ctgtgggttt gagcttgtga gtacacatgc ctgtggaggc cagaagaaga 5760 

acaccagatc ttccctggag cttgagttgc aggtagcgat tagttatcct ggatggattt 5820 

ggggaactaa actggggtgc tttgaaagag aaatatgtac tcttaactgc tgggctttgt 5880 

ctccagcctt taaaatatta atcttatata tttaagtaaa ctaagctagc ttttgttttt 5940 

aacataaatt tgctgtggat tttgaatctg gcttgcaatt ttattttact tttttggtgg 6000 

ggagggtaag gttagtaatg tgaaaggtat ggtttttgtc caggcttctc ttcttcccta 6060 

tttctcaaaa taacccttta gtttatttgg ttttctgtct ctacgttata ttttctagaa 6120 

tatatatata tatatacaca cacacacaca cacacatata tatacatata tatatacaca 6180 

cacacatata tagtgggatt ggtggaccca agtgtataca ttttaattct taattctaca 6240 

tctagcnttt tatttactct gagacatatt cttgctctgt tgcccaggct ggactcaaac 6300

tcaaaattct cctgcctccg tctctggagt gatggatcac agttgtatgc tgccactgtt 6360

tgcactctaa ctgtgtagtt gttaagctgc ttattggtat gctggtgtct gactgctcat 6420 

ttcctggaca cagtgctgtt ataattagct gtagttcctg ttgctcttta atcctggtag 6480 

cttaattcta acttgctatt tttccgtcag agaaggcaca agactagttt ccagtataga 6540 

actgtattta cttccaatca ggtcaaatat atatatattt gtttgtttgt ttgtttgttt 6600 

tgttttgttt tgtttgtttg tttttgagac agggtttctc tgtatagccc tggctgtcct 6660 

ggaactcact ctgtagacca gactgacctc gaactcagaa atccacctgc ctctgcctcc 6720 

tgagtgctgg gattaaaggc gtgcgccacc atgcccggcg ggtcaaatat ttaaaactac 6780 

ctctttttga at^attattg atgaattgtg aattatgctg ctagtcctgc ggtcccttca 6840 

tgagggtcct ccccatgagg cacctcacag gtacagggcc cagccaagag gccaaatgat 6900 

tgcctaaagt gt^ctttcat tctccaaata attcacctca tccaaaatcc cacttcctat 6960 

attttctaaa ccacggccac gttgtgtgca tccgttctat ggttcgtttg attgctttga 7020 

caagctcagg acctagcatc ccaaagctaa cagaacccaa ccgttaggat attctgaatg 7080 

gaaggtcact tgtgctgatg gcttttattt tcctcccacc cccatttggt tgggattgta 7140 

gctcttgata ccgcaccccc aacacacaca cagacacccg gaccatagct aacacagagg 7200 

taaaggagct gaaacacgtt tttctctgtg tgacacactg gaaaactgat gaagctccaa 7260 

agcttgatag ggattttgat atcagatgta ttatcctggg tcatgttcgt gagactggct 7320 

cagactctgc atttgacttt gagtcgaacc acagagtggc ccagagtgac tctctgcttc 7380 

tagccagcct gtctgacaca ctggctggcc ctcctcacca caatctgacc tcatgctggt 7440 

gaggggaatt tttacctggc tggcccccag gacagggcat gggccatggc agcatcctgt 7500 

cgcagttctg ttgttcagag gtggatccca ctgcaggaaa ctagagtttc ccatcaatgt 7560 






ctttcttctc agttttatgg aaataacctt tccttaatgg 
aggtgggacc atgtactctt gctctcctgt gggtctcaca 
atggctgtgg ttgcctttgc cttcctttag agcaggcttc 
gcagcaggta acagttccta cttggtgttt gagatctgaa 
gctaagccag ttcctttctt agtttagaga gacatttctt 
tcctctaggt ctgataacaa ccagcctttg gggtttgaga 
tggcaacttg aaaaaagtac ccagagcctt tgcatgttaa 
acaagcaact gtgcaacaca cacagacaca agcaagcagc 
tccaaatgcc tttaagaagc ccacaagttg ggaacatacc 
ttagaattgc atgttagtct acgtgatgga agaaggcatg 
gctcatggaa gactttgcat gaattccagg cttccagagg 
gtggatttgg ttcctgggga tgacatattt aggtaaagag 
aaagtgtaat ttacagaaag tgttctggga ctgctgttgg
tctgttcact ctggcctgat gctcatgaag agttgacctt 
cttttacttg gtatcctgct gtgtctacat ccttgctgct 
cactgttcgc ctttaaaatt cattcctttt catttcctgt
gtagatggct gtaaagctga gggctctgca ttctgggaga 
tctgtgtgtt ctctgccagt gggcagagca cccaggctct 
tgaatgatat ctattgatag aagcttnang nnnnnnnnna 
agtgcagtgc ctggcatatg ggtgacattt agcagtactg 
ccagtgctgg attcatgggc ataaaccctt atgtgtcatc 
gagacacgac tgtacagacg aggaaagtaa ggtttatttg 
ttactgaaaa catggcttat ggctagataa gcctcatggc 
ctgacaacca aattaaatta ttcttgccaa aacacaaaat 
aatacattcc cttacatatt ttaatccagc gttatctcgg 
gnacagtatt gaatctatat tgtattcttt gactgtactt 
aggtcttacc atataacccc agctggcctt gaacttacca 
aactcacaga gatttgcctg cctcagcctc ctccaactct 
ccaccattcc tggcttgtaa gttacttctt aagtgttgtt 
ggttatggtg taggatagtt tatcttggat gcatgacata 
ctcttaatag tttttagaag gtataccgga ctccaaaata 
aaaacagatt tattgttcca taagcttaaa aattaaagtc 
tggcttttac ctatttatct catttatacc cagaatattc 
gagcagtgat tctcaacctg tgggtcacaa cctatcctgc 
gaattgtaac agcagcaaaa tcacagttac gcaatatcaa 
gggtcaccat aacgtgagga actgcattaa agggtcacag 
atagccacat agagccttta cagggttcat tctcgttgtt 
tagataactt tctcacagac ttggttatat ttccaagaga 
ctctaaaaca attaagattt ttctagaaag ttgattattc 
ctaggatatt tcatttgtat atgcattatg aaaaaaattt 
atagctgtgg aaaagtgccc cattttaaca cactttgaac 
ttgtttgttg ttcctccccc gccccatccc caactttctt 



gggccatgca cgtgattaga atatactcct tcactgagct 
ttcttttatt ttatttttta ttttgcgaca ttgtctcact 
gaactcatga tccttctgct tcagtctccg aagtagctgg 
agcctggatg taagagtttg ttgttgattt aaattagata 
catttgttgt cattttcatg gccctgagta cacttcaaca 
gaattcttct ctaaatccta gcagaccttt cctgatcttt 
ggtacctgca gccatttatt ctcaagtctt caaatattct 
atttctttgt atttcaccat agaagttttg catgacctca 
tcaaacactg actgataata aagagtggag agatttttat 
ttgtttgttt gtttggttgt ttggttggtt ggttggttgg 
ctgtgtagcc ctggctatcc tggaactcac tctgtagacc 
ttatttttaa atacatagaa ttttagcagt tattaaagat 
tgagatggat aggtttgtat agaagaactt gactttggct 
tgtagagcat ttccttgttt ttcagaattc atcaggattt 
ctaagaatgt tgtttccaca aacacattgt acaatatggc 
ttcagaaatt gtttgaaact ccattctaat tctaggtcaa 
agccagtttt tataaatcaa gcattctaat gtaatacaat 
acttacatgc taaggaatgg cactgatgaa atattcacct 
gctctatgta tacgaaatgt acttcactta atggcacggt 



aactgtaatc 
gtaacagggc 
caggaaagca 
gataacgtgg 
gactctgcct 
acattcttgt 
gcaagaactc 
tcaacctaag 
agaggaatga 
gagtcttctc 
gttcaaacca 
agcttaagaa 
ttcagcctgt 
taccttagtc 
tctaacttgt 
ctgata^agt 
ccctgctcat 
ctgagcctgc 
nnnnntncnn 
tgaatgatgg 
ccataacact 
cctgttctcc 
tagatcactg 
acagatatcc 
aagcagtgtg 
attttttttt 
tatagaccag 
gagattaaag 
actaaaattt 
tttaaaaata 
taactgctta 
tcaggaaaag 
actggccagc 
ttatcagata 
caaaataatt 
atttaggcag 
tctatagaaa 
tagctgtttt 
acgtgtaaag 
aaatggtcaa 
tccaggcttt 
tcatgctagg 
gcaccccccc 
aagttacctg 
gattagaggc 
ttgtctcctc 
gcatccctgt 
cattttctgt 
gcttctcagg 
ggcatataag 
atttttttgt 
tttttcgaaa 
atgctggtct 
aaaaggcagt 
gaatatttga 
tgattttgta 
agaattgtgt 
aattcatttc 
caaaagtgca 
actttctgta 
attacatata 



ctacaaggac 
ggctgaaggc 
ctgtgagtaa 
caagcaagag 
ccaggtgacc 
tttttgtttt 
ttctgcctag 
catgctgcat 
ctgaactgcg 
taataaaact 
gcaacccaag 
tcagtctctg 
tcactctcac 
tcagttgctg 
ctgctttcaa 
atatggttca 
gtgtgctgct 
tggttgggtc 
nnnaaaancc 
tggggatagc 
aagaattggt 
tctcttcaag 
ctaaagagtc 
atgtgaatgt 
cataagtaga 
atttgggata 
tctggccttg 
gccaaacttg 
ttaaatttaa 
tttatatatt 
tttaaatata 
ataccaaact 
aaactctgta 
gttacattat 
ttatggttga 
gttgagagct 
acgtttatat 
ataatcccta 
agataaaatt 
gaattatgcg 
atactgcagt 
acagaaccca 
ccccccccag 
ggtaggcctt 
ctgtgctgtc 
taattaactc 
tcatatcctt 
tcacatggaa 
acactttctt 
aacaatataa 
ttttttgttt 
cagggtttct 
cgagtgagat 
ctacatactg 
tactatagga 
gatgccagtg 
ttagtgtcat 
atggaactca 
ctagttttgt 
acagcagaaa 
tgctagcatg 



7620 
7680 
7740 
7800 
7860 
7920 
7980 
8040 
8100 
8160 
8220 
8280 
8340 
8400 
8460 
8520- 
8580 
8640 
8700 
8760 
8820 
8880 
8940 
9000 
9060 
9120 
9180 
9240 
9300 
9360 
9420 
9480 
9540 
9600 
9560 
9720 
9780 
9840 
9900 
9960 
10020 
10080 
10140 
10200 
10260 
10320 
10380 
10440 
10500 
10560 
10620 
10680 
10740 
10800 
10860 
10920 
10980 
11040 
11100 
11160 
11220 
tgcagtgaga agcacgcatg ttgcatactc aaaacagaag acgcaggggc agctgcacaa 11280

ggcagcggtg ggcagagaca cttattcatt catatgtatg tggttttgaa ttaagttttg 11340 

ctcattgctt atttaaaaac ttttattcac aatttttttg accactaaaa tcagtttgca 11400 

acccacagtt tcaaaagctg caaaataaga cctacatatc tacctcgcac aattgtaaat 11460 

cacacgagac ccttgtttgg gtattgtaag aattgaacac tgtatccagg aagtcattag 11520 

taaaaaccta atgtggtgcc ttgcttttta aagatttatt tacttattta atatatatgc 11580 

cctatcatat atatacctgc atgctagaag agggcatcag atgtcagtag agatggttgt 11640 

gagccacgat gtggttgctg ggaattgaac gcaggacctc tggaagagca gccagtgctc 11700 

ttaaccactg agccatctct ccagctctta atgtagtgcc ttttgtcaga cattgtgtat 11760

atgggatatt tagctacagt tgtttcactt gccatttttt tccttataat tttccattct 11820 

tcatttaaaa agaaatatct cttatttttt ttacctgtaa ttaaatatta ttcaacagtt 11880 

atttagtatt tgggtgttgg gttacttagt attttgtagc ttttaaacat ttgttctttc 11940 

ttttcctgag tatgtttgag tccctgcata tatgcctgtg cactgtgcat gtgcctggtg 12000 

cacttggggc tcatggaggg cctcatatcc gttggaactg gagttagagg cagagctggc 12060 

ttatgggtac cagacctggg ccctctgcga aagcagcaac tgaatcctta accactgaac 12120 
aaaatctctt cagccccatg tattttgtac ctttgtgttt tatccttgaa ataaatggcc 12180

ttttaagaaa tgagaaaagc ctttaatccc agcagaggta agtggatcac tgagttcaag 12240 

gccagcttgt ccacatagtt ccaggagagc cagggctaca cagaggaaaa aaaaaatnca 12300 

aaaaacagga aaaacacaca cctccttgat ttaagggttt tttgtttgtt ggttgttttt 12360 

tttttttgag ttttggggag ggggtatatt ttttaatgtg tctgtagttg gctttgtttt 12420 

aagcatttta atcatacttt atttttaaaa aaactaaaag cttttttaag gctaggtctt 12480 

gctatgtggc cctagtgttc ctgggacttg ctctgtacac cgggttgact ctgagcctgt 12540 

gcgccttctg cctctgcctc catagttaga ttctcaggac atgttacaaa gactgtgctg 12600 

tgaagatgag tttttgttcc tgggagggaa ggttggagct gacttgtgag gtactgactt 12660 

gggtctgcct tacagttgac tggattgttg cggacatgct ggctgccaga caggatgccc 12720

taggacatgt gcgctacgta ctgaaagaca agttaaaatg gcttccgctg tatgggttct 12780 

actttgctca ggtaaacttt gtctttgccc ttttatttca aacttaacac catttaatga 12840 

aactatatct gatttttttg tttatgtgtt tgttttatgg tacccgtgat tgaacatggg 12900 

gtcatatgtg tgctactgag tgacagcctt agttcagaca ttttttaaag cgacttttac 12960 

tagtattttt atttagaatt ctatatgtgt gcacatgcat atgtgtgctt gtgtgcacac 13020 

gtggatgcat gtgaggtcga aggacaattt tcagtacaag tgtgagtgtc actttttagg 13080 

caccttccac tcttattttg agacagtctc ctagaccttt gctgagttgc ccaggctagc 13140 

cggccagtga gccctgggca tctaccggtc tctgcctcct tacctttact taggttacaa 13200 

gtgtgtgctg ctacgcccag ctgtttacta gattctaggg atccaaatgt gggtcctcgt 13260 

aacttgtgag acaagtactt tccaaactga gccacctccc tagctcttct tcacggttcc 13320 

tgatggtgtg tgtctagatg gctggttgtc cgtatattta agtccagtag cagaaataca 13380 

aatacctagg agtccaatag aaagctacaa gtgcagaatt gacaatcggt aatgttcgga 13440 

aattgattca aaagtagtta gtgagtgaca gacaggagct aaaagcagac tctgagctca 13500 

gagtgtgaag tgtggagaaa tgtgttttct cacagttctg aaggctgaaa gtctccccaa 13560 

ggtcaggatg tgggtggtac tgctgtctcc caacacccac ctctttggat tatagactgc 13620

agccttctcc ct^tgttctg agccggcctt tcccacatgt ggacatcctt ggtgggtgtt 13680

ccaccagcag ggcctcagct agtgccctta tttcacttaa ctgtaatgat tttcttaaag 13740 

accctgtctc catacacagt cactgtggaa gctgaagctt caatgtaaga gttaaggggg 13800 

gagggggaaa tttagtccat aatggtgtca caccaatctc tgtagctgag tccatgattc 13860

agttctttaa aggctctgag tgtagacatt atcttaatta ttttgcccat ttatgtatta 13920 

tctttaattt attttatgta actgaatgcc tgtgtatata tgtttctggt tcctagtcca 13980 

tattttaatt ccttaaagga tggaggtgta gacttttgtc tttttaattt tctatccttc 14040 

cctcctggcc tcctgtggcc tcttttacgt atttattatt tttaatttat tttatgtgtt 14100 

tgagtgtttt gtatctatgc attcctgggg cccatggagg tcagaaaaac acattaggtg 14160 

gcctacaact gagttatggg tggtttgtga ccatggggtg ctgggacttg atcaccagtc 14220 

tctgagaaga ctcgtgtctg ctgagccttc tctccagtcc tgggagtgtg gatattttaa 14280 

ggatactttt aattgacttg gtgaatgaca gtagaaaatc aatgagttag gatccatcgg 14340 

aaaaagcttt tgaactaaat cttttaaaga gaaaatattt taagtgctaa caaaattaaa 14400 

tgtgtatttt ccatgatgca gttttacttg ggctctgtag aaataggatt ttcaggtaca 14460 

tattgtatat atagttggca atatttaaat actaactgtc gcttgagttc tgaaatgtag 14520 

ttttatgttt tttactcatt aggagtacag ttgccttaat aactacggag attagttatt 14580 
aaagaataat tgctcttctt ttttcttttc tgtgtaccag catggaggaa tttatgtaaa 14640 
acgaagtgcc aaatttaatg ataaagaaat gagaagcaag ctgcagagct atgtgaacgc 14700 
aggaacaccg gtaagtgcgc ccgcttttat tcctcaaggc aggttaagaa gttaagttct 14760 
taagtcattt tgaaaatata ttaccccatg tggagcaatg gaactggttc ggggttttgt 14820 
tgagataagc tgtcctctgg ccgtgaggta agattgctgc aggtgattgt aaggtttctc 14880 
ctgagtaaca gtcagcatgg gctcgggacg ggcaagggca ggccttagtg tgcagaggat 14940

ggagctcact ga^gccccaa agagttagtc ttcacatgag attcagttct agaagaagtt 15000 

aaattgcttt ctttctgtgt aaatttggat ttttattgta gaaattaaag tttgttttct 15060 

tttaaaacaa acacaaaccc agagcaaaga gtctcctagt gaagagtcat tccgtgtcag 15120 

tattttacac aajstgttttt ctgtaaaggg ggaaaaagaa ttcaaatctt ctctttcaag 15180 

aatgctgact gctgccaact gcctctcccc gtggcccctc tctgtataga caggcatagc 15240 

tatggtgagg acttgggcgg ctcttgtctt tctcctctct ctgcttctct accctttctc 15300 

tcgtgccctc cabttaccag gccctgggaa gctacacacc aggcaacagt gaccagggcc 15360 

tcggcctggg ctfccgaccaa ttactagagc agaaacagca gcagctgcag tgttgttttg 15420 

tgctgtgcac tgtattaggt tgtgttttca tcacctttgg gttttgtgat gttttgatga 15480 

agtcctggta ccattctagt ttttacattc tgggtagata gagtttattc aaggtctcaa 15540 

ggcatatgaa tggaagagct cctctttaca gccattcgtg tagcatgcat aactgctctt 15600 

ctgtattctc tctagtgtct ttttttttgt gtgtgaatct gatgtcttgt tattcaccta 15660 

caatgtggag taatggtcat aaacatataa agtacttatg cctttatctg ccaaattgta 15720 

tttaactttt cagcttttaa tataactttt tatataataa ttaatttatt ttaaaaaaaa 15780 

ttgaatacca gcctgttata gtggcatatg cctgtgttcc tagcactcag gagacaaagg 15840 

cagaagtgtg agaacttcag actcatactc agctatatac aagaccccaa atttgtgcta 15900 

gattctgcag tacagccatg agtgtcccca tcttagaggg agatcgctca tccttgtgct 15960 

gttctttaag tcttaccctg caacccactg taagtacact cttgctcaca gtcctttaga 16020

atctcacact ctttctcttt acagacacca tgtcattgcc cactttatta tttatctgat 16080 

gtctacaaag attatgaaag agaaacttgt atgcattctg tgtaaagtac ttgacacaaa 16140 

taatagtatt caagaatgac ttcttaaatg aacactgaat gaatagtttg ttctaatttt 16200 

tttgatcaac aaatcaaaaa atatttagat taaatatcta agatacaaag cataatacca 16260 

catgaatcat taaagtgagt aatcaatctt ataagtgact gaccctaaaa ctcatagaca 16320 

ttaataattg ctttcattgc ttagatataa actttattga ttaatacgtt ctcatgaaag 16380 

tggttcttgg aaggttctgg aaacgaaaat atttttctta ctgctttttt cttctagtaa 16440 

ctgattgaat ttttctgcag ttccataaag catctggtca attgctatta tccaatatga 16500 

ggatatataa ca^agtattg atttttaaat ttggcggtga taagacaaga ctgggcgtgt 16560 

gaatgagggg gtbtctgttt cttgtccctt ctcttgggtt cttttccttt tgttggtttg 16620 

ccttctccag ctgctatgtg atgggttctg attatcttat tatatcttat tttgttattt 16680 

ttcattgtta tcbcttagaa gccaacatgt tatatcacct ccactcccac cattaggtgt 16740 

ctcacaaata ccccaagcta aacaaccaca tcatgtcatg ttctgtatac ttccataagt 16800 

gttgttaact ctkctgactc ttgtgagcag gcccaattgg ctttatccct agctgggtga 16860 

cctgggttcc tcbccaacac cataccgtcc atcaaactga gtccttttcc aagcacacac 16920 

cagatactgc tcatctgagg actcttctca tccacctaag gactgcctgc tcctcggcag 16980 

aaagggcctc tagtcccata cccttacgcc ctcaccaatg ccttaggaac atgtgctcaa 17040 

tgcccctgtg ggtcatttcc gtttacagta gggaaatttg cctgataact tgcagcacac 17100 

ctataaagag gccttgcttg ctctcatatt tagctggaga agataatgta ctcaccaact 17160 

ccactctatg caacccagtc tgctctgccc atgccagtca gacgtgaatc ttacacctgg 17220 

attcagattg atgaatctac aacatcaccc actccatgct tccttctaaa tcagcagttc 17280 

tagcctgaat gacagatgct acccaagtct catctagtta gccctgtccg gagtaaccct 17340 

gaccttgagg attagaccag gatgcacatc ctgcaccagt tccctttgtc cacctgactt 17400 

catcccaccc gggccatagc ccatgctcag gctccaccct ccatgcacaa agctggcttt 17460 

tccagcttcc ttcacctgta tcagacacaa atagcaaaag gggtccacgt gcctaggtcc 17520 

catcacaaga ccatgtgcgg tagtttggaa aacagtctcc acttgaggct cagatagntt 17580 

ggaatcttgg ctctcatgta gttgtactga ttagatcagt ttaggaagta tgacctttnt 17640 

ggaaagaaca tataactggg aggggctttg agatgtaaag gccccacaca attcctagtt 17700 

caaactntac tgcctgctca aagcttgagg catgaactct cactgttcct gatgtcatgg 17760 

ttcctgtctg cttccacaat tccctatcat taggggtccc tttccttcct ggaattttaa 17820 

gataaataaa cacttctttt aaaacaacaa gaacaacaaa tctgacnctg ataatggatt 17880

ttaaggcgtc ttctctggat aagaaaaaaa aaagaatata tttgcatagg tgctgtatta 17940 

cttttgtcat tggtataacc tgactggaag caacttaaag gaagaagaat gtatcttgat 18000 

ttgtagatta agagcaccat gactaagaag gcatagcagc acaggtgcac cagcaagaac 18060 

ataggctgct aglctcagatc tctgtagata tgggaacagg gcaggaagct agtagtctat 18120 

aaacctcagg acbcatccca tggagttcct tgtcttccag tgatgtcctg tgtcttaaag 18180 

tttcacagtt ccjcacagcag cacctgccgt ctgggaacca acctgtggtg gatattttac 18240 

aacgtgatag gclatattttg tctctagccc tgtaggttta tagccatcct atacttcagt 18300 

ttatctagtc cacctcagtc tgatggtctt atagttccaa cacttcaaaa ctacaaagtc 18360 

ttaagggcca tgjggctcggg tttattagag cagtaacacc tctactagct ttctgtgtta 18420 

cccactcctc tt^aggtctg gttgaaatcc taataggaag cagcttgaga ggagggttta 18480 

ttgtggccca tabtttgttg gtacattcta tcatgcaagg gtggcactgt gatacagccg 18540 
aggccatccg aggatggtac tgttggctta catctgggtg ggacaggaaa tggtaattct 18600
caaggcccac ct&cttggtg acttctttca gttaagcccc atactctaaa tcctctacaa 18660 
cctcccaaca ta^tgccacc agctggggat cagctgttga cagtgctggc ccaggggagc 18720 
agtttaaatc cagaccaggg gacctgaaaa cagagaactg cagaggggct gtgggacttt 18780 
ataccagctt tgcagacaaa tcacggcatt tctttgtgag cttggttcat aaacaaatat 18840 
atattctcct ataggctcct ttagtgggtg tttcatatcc acaaatttgt tcagaaaaac 18900 
actgtgtttt atgctagctg tgtaggagat aataccgctg ggagtcactt gagcatggat 18960 
aagtgacata gttcgtcctc atgagtccct gtcctgtttc tgtattatgt ttacttgatg 19020 
agtttagttt gtcagttggc caccaattaa aaagtatcat tttatttttt ttacaatact 19080 
cagttctcaa gttaggagtt ttgttattat atggcttcaa tattcacatt ttaacctttc 19140 
caggagttaa gtataaaaac ttatatcaac tgttgactta gtaaatatct attacagata 19200 
ctatattctt cttagtttat atcatgaata tgaggttgct taaagtaagt gatgtaaaat 19260 
acactagggg atgcttataa aatggaatgt tgtgagtttt ttgaaacacg agtactaaat 19320 
tcataagttt ttaaatagtt acactgttag cttcagtact gctagataca tgtctataat 19380 
ggctgaagag tggagcttgg atattataag tgtactctgt atattcatgc agacatatag 19440 
ca^attccac tagtatgtgt ggttaatatg tgctaataaa aatftaatac aaaagtcatg 19500 
ttttattact gggaaccaga ggggttggtt gtgctgattt taagtcagtg actattagca 19560 
tattctaaga aacagtttta ggattttaaa gattggcttt accataaatg tagagctatg 19620 
ttttactata atccatatta tggtcggcct taattcaatc tctgcagttt ggttactctg 19680 
ctcaaagtga aggtcattta taaatgatac acattttctc accataggaa atactacctg 19740 
gccaataaca gagttagaat tgctaaattg atggtaccaa caatggactc aacacaaact 19800 
aaagtttatt tabgcccaca gatgtatctt gtgattttcc cagagggaac aaggtataat 19860 
gcaacataca caaaactcct ttcagccagt caggcatttg ctgctcagcg gggtaagtaa 19920 
agatttaact gtattcagaa aaacactttt ttaagaagag tgatctttgt ttccttcaga 19980 
gtcatactaa agkatatgcg tttcttgtaa gagctaagtg agagaatatc cgatcttcta 20040 
cagagttagg tatattctta ttagtctgtg tctgagaggt tagagacgca ggcttgctat 20100 
ggcacatttc cc^tgctgtg aattgagtta aaaatgtagg taaatgatat ccccaagaaa 20160 
gtatactttt gg^gtgactc agtataaagc ctggtgttat aacataaaca cgcacgtgcg 20220
gatgtatatg tagcacatat gtaaacacag gtatatgcat tgtaataaga aagtggaggt 20280 
cggggcccac tgpaggcaag tcttttagtg atgctgagct aatgctgaga ggtagaaaga 20340 
ccaagaaggc tggagttgct cattcggcaa aggtcagagc tcactgtgtg ccataactcg 20400 
agtgttctgt ctcccttttg atacagtttt cttgttttta attattagtt tttacaatta 20460 
tcccataaaa tgtgggctca ttgtggtcat cgttttcata aagtccttca agtatacacc 20520 
cagcaagtat ctaaatacac tgggaagaat cagtcagctg atggcttgaa gtttcaggac 20580 
atctagtgcc acatcatgct tcagaaccga cctgcactta gtcagggtca tattcatgcc 20640 
acgtgaagac gagaggaggc catgccgtct gacttaggat ggaaatttcc ttcgagcaaa 20700 
cacgaacggg ctaggtctta gttataggca tagtgtctgt ggttatacta ggcagacatt 20760 
agtggactgg gtgttagaag gtacagacag gcaagaattt gctgtagatt tgtttccctc 20820 
atgtgttgac accacatcta acctgctttt tgagcttcta gtcctaataa tctcataaaa 20880 
atactggttg aaccagaaat ggtgttgcaa agctatgatc ccagctcctg ggatctaggg 20940 
tgggaggatc ataaatttga ggccagcttg ggtctgtctt agagaaaaaa gaaaaataaa 21000 
aagtctggtc aaggtaacat ggagcctgga agtttcacag ggtgattctg taaaggtcct 21060 
gagacaagat ggcctctagt ggcgaatgac ttagctgaca agaaaacttt cccagcttgg 21120 
ttgacttttc agacttcata caagtttgtg aataaattac actccttctg cccttgggac 21180 
tgaactcaga tatgtggttg tgggaatggc tttctttccc acaccaccct gcattttaaa 21240 
aattcttctg tagacagtcc caccatcctg tagctgttct tccttatgtc gccactttcc 21300 
ctggagagag gcjagtgcaga cttcaacccg cttctcccta gtcgctgttc atagcacatc 21360 
gaaagaccta gt^cttcctg tgaaattgta agtacatcct ggagtccagg agaggaggaa 21420 
gccgaacaga gtggagggaa tgctgagttc tgtcctaaga aagactgcgt gcttagcaag 21480 
atgctgctgc tctcctgtcg tgtctttctt gtcagaactt atcaaagaga aggctcgcag 21540 
tgggtcataa tcttcccaag gaccagcctt cccagcttct cgcagcatat ctcattcatg 21600 
tagatgttta atggatatgt gtcaatgggg ttgacctaag tgagatggca atgtatgtga 21660 
gcattctagg tgtgaggtta tggcattaaa ctttaatttc cgtctatttg tggtagttga 21720 
taagtaattt agatgttgac tttcatgtat tcctaattat gaccacattg aatctacctg 21780 
ctttctaggc cttgcagtat taaaacacgt actgacacca agaataaagg ccactcacgt 21840 
tgcttttgat tctatgaaga gtcatttaga tgcaatttat gatgtcacag tggtttatga 21900 
agggaatgag aaaggttcag gaaaatactc aaatccacca tccatgactg gtaagtccgt 21960 
atttccatag aagctgaata gtacatggta caggtaagat aaactcttgt ttgttcgctt 22020 
tgcttagctt ggttcagttt ggttttcagt agagggttcc actatgaagc tctggctggc 22080 
cgggaactca ctatgtagac caagctggcc ttgggctcca ctacacccag caccaatcac 22140 
ccactcttat cttttatgct ttttgttttt gctttgagct ttctttataa catgtttggg 22200 
aaggacattg tcattattta caagaagaaa tatggtcttt tcccaacatg ctagaattta 22260
aagactcaga actcttgcct ttgtcagtga caaagtgaga atggctgtga agtgacgtgg 22320 
ctttgagtga gaatagttca ggtaactata gccacagact caacatttga acatgggaac 22380 
aggtgagaac ggagtgatgg aagattctgg cccctttcag agaattcatt ttagagagag 22440 
atgagagtag taaggaagag agaagagaga gacgtggtat tttgctgcag actaaagaga 22500 
tctcttataa tcgcagtact aaggaggaag aagcagaaga tgatgactac agggccaggc 22560 
tgaacaatct agtaaaatcc taagtcagga agtcagggct gaggtgcagc tcagtaggag 22620 
agttgttgtc tgccctacac aaggcctgga tttagctccc agtagcaacg aagggaggcg 22680 
agggtgggca aaatcgaaca cttactcttg gagactccct ttatgaatat taccacactc 22740 
cagtaaatac tctccagaga tttcagatga gattctgctt cctggtaaac aggaggccaa 22800 
gaatattatg tcacactgaa catgggatgg aagacatgtt ctgaggaatg tctgcactcc 22860 
agtgtgatga agacttgaag tttagggaca ttttccctcc ctggccccac tcaccccatc 22920 
tgtattgagt attcccctag tgctcatctt tatttgtatg ttaactttca ggaaggggaa 22980 
gcagattgat attcaaaccc agccagtttt cttaaatact ttgtggatgg gattggcttt 23040 
gacagtaaat gaggaaatgt aaaatgtaaa agattctaat ttttaatatt ttaaaggtga 23100 
ggttttctgt tagtacgcag agtgagaggt ttcttactga tgtctgcgta cctagaggaa 23160
ggatggctac ttctccaagg cttgctgtta gaagtcagtg acatgggctt aacaagagat 23220 
atgtgctaat gaggttttaa tttcagctta atactgcaaa tcataagtgc atagctttat 23280 
tgttttaaat tcttttagtc ttaatgtttc atttttacca taagttactt tgtataatca 23340 
caaattctaa actagtaaga cgtgaaattt tcttcttctt tgttagagtt tctctgcaaa 23400 
cagtgcccaa aacttcatat tcactttgat cgtatagaca gaaatgaagt tccagaggaa 23460
caagaacaca tgaaaaagtg gcttcatgag cgctttgaga taaaagatag gtaagtggta 23520
agagctccag catttagaaa gtgcagttca accaaatttt actctcagat cctgcttgaa 23580
aggagtcttt ttatcttcat tatttagtaa atactaatca tacctgcata gacaagacca 23640
catatactta aatgtagcat gtttcatggt gcgttaccct tgtttaacaa ttaagtttaa 23700
catcctacat cagtttgcct gttgatttct gtaccatgac aactcaacac agcgatgcgt 23760
ttattccaaa gtcgatagca cagcaaaagt gaaactaaag tctgtattgt ttcaagaatg 23820
ctttttgtga actcgggtta aatcttattc tatcctttcg tgttcacatt gtacattttc 23880 
atgagtcact ataaaaatca tgacatggtg gcctacctgc agtgtttgct ggacagtagg 23940 
ctgctgtgtg ataagagcct ttcctcttca gctacacggg ggacacgagg ctttggggtt 24000 
caagactgaa gcacgggtga gcacaacacc tttgtgttgt gggaaggaag ggaattgttc 24060 
ttttcataat gaaattgtcc cctttcttga gttagtagaa agtattacaa ggatagagag 24120 
ttgaaatgaa gctttatatt agatttatgc cttgtgttgt cacgtgtttc tacctgacat 24180 
aacttttcaa cccagccgct caggattatt ttgatgatgg gaacaatgta agaaggccta 24240 
tgtatcggta actcactgtt gtagctctgt ggaagcggct cacaggcagt agggacgctt 24300 
ctgtgctttt gtgcctgtcc tgctgttaga atcttacaga ggaggatgaa tgaatgaccc 24360 
tttttatttc tcttgtctgc ttttctaatt ttatgggaat aagaactttt ggtaggtctc 24420 
tgtcactggc ctcttgttgt gaagagacac cttgagcaaa gcaactcttc tgagagaaag 24480 
catttagttg gggaattcct tacagcttca gaggttgagt tcgttttcat catgctgagg 24540 
acagggaggc ackcaggcag gagaagtagt tgagagccac attctgatct acaggcagag 24600 
agagacagac tgagcctggc atgggttctt ggaacctcaa agcctctcat ccctacccca 24660 
tctcccgacc ccjtatacaca cttcctccaa caaggctaca ccttctaatc cttcttaaag 24720 
agtcaccaca tc'cagcgact aagcattcag atatgtgaac ctgtcggagc ctttcttact 24780 
cagatcacct caggaggaaa actcctatgc tataagaatt tcttttcttt cgcatctttg 24840 
aaagcttgtt tttgtgtgat tagatcctgg cctcacacat gctcggcaat cattttactg 24900 
ttgagctcca gcbtcagccg ttttcattgg cttatgggat gcgagccatg ggagagaagc 24960 
tagaaggcct ttpgttttat gagtcgggtt ggtggaacca cttacagatg gaagatttac 25020 
aaacaaaaat gaagctgggg ccatcaaggc tcagcactcg ctgctcttcc agagagttca 25080 
ggttcatttc tcagtaacca catggtggct ttgtaaatgt aacttcatat tcaatgaccc 25140 
tgacaccctc ttctggcctc tgtgggcacc agacacaatc atggtataca gacacacaca 25200 
ctagccaaca cccatctaca taaaagtata taaacatatc tttatcttaa aaatccccga 25260 
agtcctcatt aaatatctta gatccccgcc gtgttttgat ttttgtttcc cacgtggtga 25320 
ggatataata tcatgtccaa actgtaagga gtgaatgccc tcccgtgcct ctcggacacc 25380 
tctgcactca tccaagtttt ctaaggagct gtacttgctc agcaagtact caatacctaa 25440 
taaatggttt atgtttgttt caacaccaaa aatgtccaaa actgaaagat caattctgtt 25500 
gttttccttc tggccatagg ttgctcatag agttctatga ttcaccagat ccagaaagaa 25560 
gaaacaaatt tcctgggaaa agtgttcatt ccagactaag tgtgaagaag actttacctt 25620 
cagtgttgat cttggggagt ttgactgcgg tcatgctgat gacggagtcc ggaaggaaac 25680 
tgtacatggg cacctggttg tatggaaccc tccttggctg cctgtggttt gttattaaag 25740 
cataagcaag tagcaggctg cagtcacagt ctcttattga tggctacaca ttgtatcaca 25800 
aattaaataa ggagttttct tgttgttgtt ttttttgttt tgttttgttc 25860 







tgttttaagc cttgatgatt gaacactgga taaagtagag tttgtgacca cagccaacat 25920 
gcatttgatt tggggcaaac acatgtggct tttcaggtgc tggggttgct ggagacatgg 25980 
aagctaagtg gagtttatgc tgtttttttt ttttttttaa tgttttcatg aattaatgtc 26040 
cacttgtaaa gattattgga tactttctgt aattcagaag gttgtatttt aacactagtt 26100 
tgcagtatgt ttcgctatat tggttatctt ccatttgact acttggcagc tcagactctt 26160 
aatactaaag tattttacat tttgaagcta tgtgatactg gttttttgtt gttgttgttg 26220 
ttgttaattt ctgaaagtca atgaaagaca ctgtaatgat gcgttaagat gttccaagaa 26280 
aaaggtgaga atfcattcatg gcaaaaaaga tctgtctagt gtatattttt attatattgc 26340 
tctatttagc taattttctt tatatttgca aaataatgaa catttttaat atttattaaa 26400 
atgcttgatt tgcatacccc cgattctaca gagaataatg tgtaaagtgt cagaatagac 26460 
ttgaagctct gcbgtgactc agtctccttt gtcagagctt ctagtagccc agctactgag 26520 
ctgctttgtt agtacctcca gcacctgagc cgttaagtac ttataaatgc aagggacccg 26580 
ttatcttcat atcggaatag acatgaacag agctctaagg cgatgaaagt ctgccagcat 26640 
cctctctgtc ctcgcacgtg ccttctgcct ggctccattt gctttggcac tgcgttcgat 26700 
ctagagtgta gg'tgctcact gcttatttca gccctggctc tgtggttttg tgtcctccag 26760 
tggtgctgtt cabtgttggg gigcaggegg^'tgctgccctg actcagaggg gcagctccct 26820 
ggctcctgag ggtgagcctt cttggctact acagaagtat tgtgcgtttg tgtatggcaa 26880 
gaaccatcag gattggataa atgtgttatt tctctttgat ttccatggag ccacactgtt 26940 
ggtacatgtc ccctgtgaac agagctacct ttcaggagca catcatactg tcgtgagtca 27000 
cggcacggtg tgtcctgtga gaagaggctt tctaacgtgt gatttgccgt gtttctatgt 27060
tgtgatttaa gcgtgattgc ctactagtca ttcaaggtaa catttctgca aatttcatac 27120 
agatttttgt cacaaaatta ctataccaat gatctagttg aaatagacca attgaatcac 27180 
aataaataat tttttttaat tgagggaaaa tttgcttctt gttttttcaa agccagaaaa 27240 
cgagccattt caaacatctt tgaagagtca tgtgctgtca cttgttttct atgtgttagt 27300 
gtctatattc atgtatggat acacatgaac atgtatattc atacacacac gccaatagaa 27360 
tataacagcc taaaaacaat ccagcttgtg tatcatgtta ctgtgctgaa ttgtaatggt 27420 
ttttacttac aaagtgaggc taaaatcgat ttcatgtctt tgttaaatac gtttttttca 27480 
gcaatcctat tagagcttat tttgaccaga tcaaaataag tacaagttca gagactttaa 27540 
atatggctga ggtctagagc gatagctcag tagttaggaa cacatgccac tctttcaagg 27600 
gcttcagttc ccagcactca tatggaggct cacagaaggc tggaattcca gcttcatgga 27660 
attggacaca tcctctagct tccatggatc tgtctgtctg tctctccctt ctctctctct 27720 
ctctctctct ctctctctct ctctctcttt ctcacccttt aaatatcatg gatatgctgt 27780 
gcatttaaat tttaagacac agaaccattg gaattacatg gattatagct gattctcttt 27840 
gaacagggca cagtgttctg cgtaagatct cttgatcatt agcactggac tcactctcct 27900 
cacaagtagc ctktcaaatg tggtattaga aaatacattg tgtcaaaatc tttgaaagat 27960 
gagaagaatc tcbtaaacat gtttattttg acttgacatc actatttcct gaaaattaac 28020 
tgtctatgat tcfttttcaca tagtgtaaga tcttacttgt atcaccatca gcttgcagct 28080 
taggggctgc agttgttctc cttcataaga ctgccatccg tgtgcatgct tttatgtttt 28140 
tcagaaagga tgittgggatg aaagtaagaa aacaaagtct cttcttgtct ctcatgtctg 28200 
tgatcactag catttcacaa ctcagggatt catccatttt ccagcagata aaagggttag 28260 
cgattaaccc tgcattctga gtttagaaag ctacaatatt ttttaaatat tgagcaatga 28320 
ttttaaaaaa atacattgga ataccccaaa ttgtgaagca atccaaaagt tggactgtat 28380
aagctaattt gcctacttta aaggatgtga ccctcaccca ggaaacctgt aggatttact 28440 
taacaaggct ttacatgaaa atgccaccgt ggccatttct taaacactgg tggcttcttc 28500 
cagatttcat ttctatgttt gtttgtttgt tgtttttttt ttacttagat tgctgtgagg 28560 
tttttttttt ataacaaata tacatttttt tctttgtcac attacatgct ttgtcaatca 28620 
aatgacctaa ctaggttggc tattaagaaa actacatatt gaaatctgcc aaaatgtcgg 28680 
cataaacaaa ctggctccta attgtgtacc agatctacat ttgaaagaac agaaatgtct 28740 
cacaagacaa taaggtcata tgtaaaacac taaataaact ttaacctcaa caattgtttc 28800 
tgaagtgttg agattaaaga ctgagtgttt gcggaacgtt gacatgtcca tggccaggct 28860 
agtttctcgt tttctttttg tcttaagact aaacattggc tggcttaaaa tattaccagt 28920 
tctatatagt ttacattata gacagaatat ataacattta agtattagta tgaaaatcag 28980 
tactttggtg agactaatat ttggaatatc cagatgattt gatatcatgt aggtaaagta 29040 
agtatttgtg tgactgactg aacttaaaat ctcttattca tatatcatgg ataacagctg 29100 
ggagttgtga cacatggctg ccatccaggc actcggaaaa tccaggtttt gagaaagaga 29160 
gtgttttcca agtcagcctg gtctatatag caagcttcag actagccagg actgtgtacc 29220 
aagatcttct tcacgccacc cacacaaaca agagaagtta tatagagaat tgcttgggat 29280 
ttagttacaa catttttgtt aggatttcat ttaatgggca gggggtgggg gagttagcag 29340
tttgcatttt caJgagaatgg gttccattcc cagcatccac agggcagtaa ctgaagtaac 29400 
tgtagttcca ggjgtatccaa caccttcata tggacacaca cgcaggcaaa agaccagcat 29460 
gcataccatt aaaatgaatt attaaaattt ttttaaaaaa gactttgata tatttttagt 29520 
ttgtgtatgc tgaggtggat ctgttgctct tggcaaactg agtagtaata cgtggagttt 29580

tataagaggc caaagatctc actttcatat aatatacctt ggataataac ctagttgcta 29640 

cctaagcaaa gaagttcctt ggaagagcta catcacagaa aattggtatc agagcatata 29700 

ctcagtccat ggttatgcca tgagacaggc cagtgtttta cccagactag tgttttgtca 29760 

tatctgaata aaj:aaataac aaagtcgtat tctacaactc taaatcctca tatagaaaat 29820 

aaaataggaa tgcggttgac acaatttgtt gcctgaagtg atgaggaaat taagttacaa 29880

agtaatcaag aagtactcag ttactgtgtc ttcttatgta gcttcagaat aagttctgat 29940 

agtgttactg ggkttcctgg tgctttataa tccagtacaa gggaaaatat gaacctcatt 30000 

tttttttata aaaatttggt gacaaaaaag ttacattgtt ttataaagtg ttttgtgttt 30060 

tatgtgtata tgtgtgtttg attgccagat atagctataa agctacatta tttcagatgg 30120 

gtattttgtt tgagatttgc tgggtgttcg ctgtatgctg actgcttctg cacctcttgt 30180 

aggcaccagc ctgtctgcag gcagaccacc caccccatgt tgggtccttc tgcctctggt 30240 

tagccatcca accttggaca gctttaaata aagagtgtcc caggtttatc acctgtagaa 30300 

cattgctacc acctgtagaa cattgctacg atgttagcca caaagaattg ttgagagcat 30360 

gtgaactact gcagatgatg cagcagatgc agagcataga aagaatgtag ctggtgctgg 30420 

ctcgggttcc tgtcctctgt cttcatagat gaccttcagg tcagatgggg tttggataaa 30480

gtctagggat gggatgccat ctggtccttt cctgtttctc aaggagacag tggggtgtgg 30540 

aagtggatgc tcgtagttat ggttcagatt gtaggttttg tttaagattc agatcacgtg 30600 

gtgtagagag atacaacaag agaaagaaca tctccattat aagtcacatg gcctctcaac 30660 

tacagctcaa tagaagctgg atcttcaggg ctaggagtgg gctcagtcct cacactttgg 30720 

gcacaggtgg aggggatggc atggcagctg cagtaatgtc tgaggggttc cttctggcat 30780 

tccgcactat ggtacgtttt agactaagtt cgagacaatg cctatcggag tgccccttta 30840 

tccgtgtagt ggggagtgtg gaagtctgtg gagaggtcag tgtggtgact ccatggagtg 30900 

tgggttccca tatgtaaagt tgcactcagc tgagctctat ttgtcgaaag tgtgagggta 30960 

catctgttca ttattttctg acatatatta gagctcagaa ttgtcaggaa tgttcatgat 31020 

taaacattgt tccttgtcat ttggaaccat gcctatattg gcttctgtcc tatgcctgca 31080 

cgagttaact ttctgtctgg gccaggagtc gttttgattt ttacaggcat ctaacatatt 31140 

tctgtttgtc tgktaccccc ttccttctgt tttcctatct gcactgtctc ttcagtctgt 31200 

cgcagaaaga gc^gagaaag tccttctgtc tcccaagacc tcccatggtc tttgtgctct 31260 

tgctgaagac aaccaactag tgtggcaact agtttgatat cttttctctc tataatagaa 31320 

atgagatcat ccaagttacg ctggagaact ggggtgtgtg tctgtgtctc tgtattcccg 31380 

ctaccccttt ctcctagaga aattcattgt agtttggtta gttggaagtc ctggtaaggt 31440 

ggcaggtcct gt^gtatagt gagggcctta ttgctgggct gtgggactcc aaagtgagtc 31500 

tcaagtcccc cgaggactca acctagtacc caatctttcc aaccaccctt tctttccaac 31560 

cctatgacat taaattttca gtgcttcaat tttggagaga cattcagacc atggtaaact 31620 

cacatgagca ccaagttgga ttcttagggg aattctgaag gaactgcttt taaaaataaa 31680 

cagttttcct gatgaggttg cattggaacc agctgtttag aactgaaccc cctcctcccc 31740 

tgtgaatggc ttgaacaggt ggtttatttt ccagaagcaa catagagctg gattaatttg 31800 

ttattaggga actaaattat ctgttgacta tactgctaac ctcaacctca aaatgtagta 31860 

gtggctacca gaccacccca acacctctct gattaaagcc atcaagagga cagtagagga 31920 

cagaagaatg cctatttccc ttcaagacat ttccagagag tccctaatgt ttctacttag 31980 

ctgtgactgc ctggaactta gtacaggctt catttagctg caagggaagc tgacaagtga 32040 

aatcttcagc tttgacatcc tcaataaaat tgaaggtctg agggatgaaa ctattctgag 32100 

aacagtggca gtgttcacac catcttctga atataaatgc tgaagattac taaatggata 32160 

tggtgaattt taccacagtc caaaccaaca gagcaggctg gggatgtggc tgagttggta 32220 

cagtacactt gcctagcata gaacagctgc tgtggtaaca aacattgtta caccaaacac 32280 

tcggtgtgtg gatgcaggag aattagacat ttcaaggcca tcctcagctc tgcagtgagt 32340 

tcaaggccac ctgggggtaa ctgatttcca ccttactttt tgaggcagtg tcactaaacc 32400 

tgtagttcac tgatgaagct acagtggccc acaaactcca gcagccctcc cagggctgtg 32460 

actaccttta aacaacaaca aaacaaaaag tgctggagat ccaaactcaa gttcttatcc 32520 

tcagggcaag cactttactg actgggcttc caggcctgct ctattgctgt ggcaagactc 32580 

gctataacta cctacaattc ctcttcaaca tctatttgga ttctcttttt ctttgttttt 32640 

gagacaaggt ttptctgtgt atctatccct ggctgtgcta gaactcacta tgtaaaccag 32700 

gctgtcctcg aactcaaaga ggtctgcctg cttctgcctc atgagtgctg ggattaaagg 32760 

tgtaccacca cabccctacc tattgggttc ttaaagggca ctttccaaaa ctgatgaagg 32820 

agagttaaaa takggagaat ttttgtaaaa ctaacatgta atgtgaactg tgaatgtatg 32880 

ccccaaaatc ctacatatct acaaaataca cgagtttttc gttattttag tcaccttctc 32940 

tggttggctt ggjttttactc ttgagtaatt tttactaggc aactttcagg acagaatgac 33000 

agttcttgaa gtfctaatctt agtagaaaaa caggcaatag tgagaaagct gtgtcacagt 33060 

gttgttctgc aa'ccactcca gatctattcc ctcccaatgt ctatggagtg cattgtctgc 33120 

ttaggtgctg aataaggaca tgccaacatt tcttatctac tgtaaggcaa aattggtgag 33180 
cagtgtcact aagcctgtag
cactagtgag cacactagta 
acccaccttc atctcctcta 



ggtgcttaag 
tttttaatta 
aagggggcat 
tgttgggaac 
ccagcctcac 
ctggatcagg 
cagctgagga 
aggaaaaatg 
tggtgtgatg 
acccagcagt 
gttctagcat 
tgagcacata 
gaactcatgc 
gccaggtccc 
atgtgacact 
tcaatcccaa 
cattctcttg 
aacagaccca 
ggctctgctc 
ggtgctcatc 
ccagtcatta 
ttcaggacac 
cacacatctt 
ttttctctca 
tgtctgccac 
tttatcttta 
cagctcccta 
agcagtccta 
atggtagaat 
caatcatgta 
gtgggtacta 
ggaaaacaag 
taatttgtga 
tcagacagtc 
ttaagattaa 
attttatagg 
gatcacccaa 
ctgagagagc 
ttgacatctt 
tgggagtctc 
ttcattggga 
acctctctgc 
ctctgtaccc 
tgtcacagct 
tcgaggccct 
tactttgaat 
agcttcccaa 
taatatattg 
agccccagaa 
catattatac 
tgtgtgtaca 
ctaatccaag 
gacctccaca 
agtgtttgct 
tcggttatgg 
taaagtgacc 
ctcttggggg 
ctaggggaca 



aa^tctgaag 
ggtgatttaa 
aagatccccg 
tgatcttggg 
actcttatcg 
tactgtgaaa 
aagctgtcaa 
gatggtatca 
tactcacgct 
tctgtgctcc 
ttgctgcaac 
accttttcat 
tcctcctgcc 
actgctccta 
gcacaccagg 
acagcatctg 
tgtagtgctg 
tacataactc 
ttttctcctc 
taacttgatg 
gtgcatctgg 
atggggttat 
aagacatggg 
agtgttttgg 
ctgctgccca 
gccctcctcc 
ctgctttcct 
tggagatgca 
ctgaaaacca 
atgttttgaa 
tgatgtttaa 
acgacaaagg 
gacacattaa 
cacagcttca 
ttttgttctt 
ctacaatgac 
ggattgttgg 
agagacatgc 
gacaggatct 
tagactggat 
tggggtcctg 
ctcctgactg 
tcaaactgaa 
ttgtcacagt 
gggcatgctg 
agggtcttaa 
gcagctggaa 
tacatatatt 
tgatgtatac 
taptgtgtat 
tgtccatgtc 
tagcaattga 
gaiatcagcta 
cctgtattct 
tccccctgtt 
cacctcccac 
aggttgtgga 
gacattgaag 



ttcactgatg 
atagtgtgaa 
ccttgcacct 
gatagctctg 
atgggtttgt 
gctctagagt 
tcccctggaa 
gtatgtctca 
tatggagttg 
ttgtcttcct 
ctaagttcag 
ggtgcccctc 
gacaggaagc 
cccctttgct 
tttgatacat 
ataattectt 
gtacttttgt 
cattgtggga 
cccacccctc 
ggtcggcccc 
caacacagcg 
ttcccagagc 
tatatgtttc 
tgttagaagc 
ggctggatcc 
cattgtcaca 
ttacatggcc 
gcaagcccaa 
ataagcctct 
gagaccctaa 
gtgtcgagat 
agcacaaaaa 
aatacgtgcc 
tagcgcatat 
gtgggatatt 
gcactcaaga 
gcatagtgtg 
agacagtttc 
ggtatttcaa 
ccatcagcac 
tgctccactg 
aaaatcacct 
tgcctgaggt 
gcgtggataa 
gatgcagcat 
caaaacaagc 
gatgagaaaa 
ggtaaactct 
ttgcccagat 
ttacaaatgc 
cacatgtgtg 
acatacatat 
atatagaata 
acattgtgat 
acctgtgaat 
acgttagctg 
cttggaagga 
ggcctcatag 
aggttcacat 
gaccatctag 
atgataggtt 



aagctagagt 
agaaagtgac 
ggttcagatt 
ggggatggtg 
gcacatgagt 
gtcaggctgt 
gagcattgag 
tttgtggggg 
gcggggctgg 
gtgtctgcct 
ccttccattg 
cccttctgag 
agtgggaaac 
cacatctgtt 
gcttttcttt 
aaggactgac 
tatgggtctt 
tgaagtagaa 
ccctctgccc 
ttctctgtct 
atctggtcca 
tgtcccagtc 
tttcctggtc 
taagctcaac 
tgtgtcaagc 
tggctgacag 
ctaaagccta 
gttgtggttc 
tttccactga 
gccgctccac 
gatatgagct 
aataaaaaaa 
taataaaatt 
tcatgtggca 
tctctcaact 
agggtttgat 
ttcactatat 
ccacatgcag 
aaagtaagat 
tgaatggccc 
tctgagggcc 
aggtgtgctc 
tgatttaaaa 
agattggagg 
gatcagctcc 
tcactcgaaa 
ataacacaca 
tcaccactaa 
tggccttaaa 
gagcactagt 
tgtacatgtg 
aagtaggtaa 
tataatatgt 
tgctagttaa 
tatggtaaac 
ctactatgat 
agaactagat 
accagaaagt 
gcttgaacac 
gaggttagct 
ttccctgttt 



gcccactagt 
ttccactgtc 
tcggcagatg 
gcacatgaag 
gcaacaccca 
tgtgaactgc 
tacatgtcat 
caatttgggg 
atccctgtct 
cagtttccta 
taaggatcca 
ctgcaagcat 
tgggctgcaa 
tccacatttt 
acatagcctg 
aatgacagtg 
ctatgctggt 
catgttcatg 
cttcccccgg 
ctaaacattg 
agtctggatg 
gcctcccttt 
tgttttatga 
ttggcctcac 
acatcacttc 
cagtacattc 
ggactgtctg 
ttcctgtcta 
gctctgggtt 
gatgcacagc 
gtgctgtata 
aagttcatta 
attttcacct 
tatttacctc 
ttcctagctg 
tctgtagaca 
ctaaagtagg 
cacaaagcag 
gtgggagtta 
agaaccaatc 
cttgcccatg 
tgggcacctc 
accctaactg 
atgagggagc 
ctctcctgct 
gttgctttcg 
ccccccaaat 
gctatattcc 
atgataattc 
atgtatatgt 
tgtacacaat 
ttaatatata 
attgcacata 
ataccacttt 
ttcaggggat 
tccaagcagt 
tgttagtcat 
agagttgcat 
tgggtcccca 
tacctggagg 
ctgccctggg 



gagccactat 
ccatgaacac 
cagtgagcat 
gtcacatctt 
tacaggccag 
ctggaatagg 
cactatctct 
agtcccacct 
tgccaccagc 
gaaactagaa 
ttgaagtagt 
cagctgttgg 
gatgatccca 
atttcatcct 
ggccagcctt 
gacatgcacc 
ctaatgtgga 
cacacaaaga 
tggatgttag 
aaacaagggg 
tagaaccctt 
ctaattggct 
ctggcctgct 
agtcttgatg 
tttctgaacc 
ttgtatgtag 
tcttcaatca 
atctgcctct 
attcagatta 
agcactttcc 
tgtaaaatgc 
atatttatag 
ttttaaaaat 
aatggggcat 
caacttgtga 
aaacccgggg 
ttttacttct 
ttatgtttta 
caccttgaat 
tacagcaaat 
gtggttgcta 
tggacatatc 
ggcagcacca 
accaggttcc 
gtgacacaat 
ttagtcactt 
gaggattgga 
taaccctttt 
tcctgtcttc 
agtatataaa 
tacaaatggg 
catatttaca 
tattcacata 
tccctgctct 
tgaaaagcct 
agttactgga 
cagttcacag 
ggtttgattg 
gatgatggag 
aaatgggctt 
aatatatcca 



33240 
33300 
33360 
33420 
33480 
33540 
33600 
33660 
33720 
33780 
33840 
33900 
33960 
34020 
34080 
34140 
34200 
34260 
34320 
34380 
34440 
34500 
34560 
34620 
34680 
34740 
34800 
34860 
34920 
34980 
35040 
35100 
35160 
35220 
35280 
35340 
35400 
35460 
35520 
35580 
35640 
35700 
35760 
35820 
35880 
35940 
36000 
36060 
36120 
36180 
36240 
36300 
36360 
36420 
36480 
36540 
36600 
36660 
36720 
36780 
36840 
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aattcctcac tcgaatgcca cagccctgag atgagccttc cccttcacag tggactctac 36900 

cctcaaaccc tgagccagag tagatccttc gtcctttgct agatgtttgg tcacggcagt 36960 

gaggtaaagt accgagcaca gcatcattgt tcccatttta cggatgggaa aactgagcct 37020 

tgggagatgc ccaaggctgt cagccttgag tgtttaaaag ctgcaaggat tggatgcatt 37080 

tgtcctatat cagggaaaca tgatggggat ggagtctggg tgctgaggca ctgagttggc 37140 

aagagggaag gcctggtgtt cactcaaagt cattaaggac caattgtgtc tgcaatcctg 37200 

tgccctccct tgaacaatag gtaagggtca gtgtgagccc tgattactcc cagcagaagg 37260 

atacatctgc tttggagacc aaagtccctt cactgtagaa actaggtcct tcaaggtctc 37320 

agataaaaag acaatgggtt ttgtgctaat ttccacccaa tgggtgtgtg ggtcacacta 37380 

ctcactgggc caacatggtg gtctggacag tcacacagga tgaagtgagg aaaggcaagt 37440 

ctccccggca cctcccctat cgtcactgca aggccaccct gactaagagt cccctcttca 37500 

atgctggccc tacgaatatc ttatcactct tcgtgtttac aagatttctc tccttggaag 37560 

gtgtgatgtg gacagtgaag ctctcaacaa cccctactca cctaagacct agacaagaga 37620 

gtctgggatg gccatgtatg tactccttca gtaattgaca gccattttct ttgtctagga 37680 

agtcttccta caagcttccg caccataacg gtatcggctg tcctgattta cagacatgtc 37740 

attgggactg atatctgcaa caaagagcag atctbcagtc actcatctca ctccagcttg 37800 

ttcacagaac aaaaaggagt tgaagggagg tcttcatcat tgggtgttcc cttccatgga 37860 

gcataggaca ggcatggtgc ccagaacctg cccagcttct agctctccaa gcctcatgct 37920 

ttcctgtctc naaaaaaaaa aaaaaaaaaa 37950
<210> 184 
<211> 1381 
<212> DNA 

<213> Mus musculus 
<400> 184 

gagccgagag gatgctgctg tccctggtgc tccacacgta ctct atg cgc tac ctg 56
                                                 Met Arg Tyr Leu
                                                 1

ctc ccc agc gtc ctg ttg ctg ggc tcg gcg ccc acc tac ctg ctg gcc 104
Leu Pro Ser Val Leu Leu Leu Gly Ser Ala Pro Thr Tyr Leu Leu Ala
5                   10                  15                  20

tgg acg ctg tgg cgg gtg ctc tcc gcg ctg atg ccc gcc cgc ctg tac 152
Trp Thr Leu Trp Arg Val Leu Ser Ala Leu Met Pro Ala Arg Leu Tyr
                25                  30                  35

cag cgc gtg gac gac cgg ctt tac tgc gtc tac cag aac atg gtg ctc 200
Gln Arg Val Asp Asp Arg Leu Tyr Cys Val Tyr Gln Asn Met Val Leu
            40                  45                  50

ttc ttc ttc gag aac tac acc ggg gtc cag ata ttg cta tat gga gat 248
Phe Phe Phe Glu Asn Tyr Thr Gly Val Gln Ile Leu Leu Tyr Gly Asp
        55                  60                  65

ttg cca aaa aat aaa gaa aat gta ata tat cta gcg aat cat caa agc 296
Leu Pro Lys Asn Lys Glu Asn Val Ile Tyr Leu Ala Asn His Gln Ser
    70                  75                  80

aca gtt gac tgg att gtt gcg gac atg ctg gct gcc aga cag gat gcc 344
Thr Val Asp Trp Ile Val Ala Asp Met Leu Ala Ala Arg Gln Asp Ala
85                  90                  95                  100

cta gga cat gtg cgc tac gta ctg aaa gac aag tta aaa tgg ctt ccg 392
Leu Gly His Val Arg Tyr Val Leu Lys Asp Lys Leu Lys Trp Leu Pro
                105                 110                 115

ctg tat ggg ttc tac ttt gct cag cat gga gga att tat gta aaa cga 440
Leu Tyr Gly Phe Tyr Phe Ala Gln His Gly Gly Ile Tyr Val Lys Arg
            120                 125                 130

agt gcc aaa ttt aat gat aaa gaa atg aga agc aag ctg cag agc tat 488
Ser Ala Lys Phe Asn Asp Lys Glu Met Arg Ser Lys Leu Gln Ser Tyr
        135                 140                 145

gtg aac gca gga aca ccg atg tat ctt gtg att ttc cca gag gga aca 536
Val Asn Ala Gly Thr Pro Met Tyr Leu Val Ile Phe Pro Glu Gly Thr
    150                 155                 160

agg tat aat gca aca tac aca aaa ctc ctt tca gcc agt cag gca ttt 584
Arg Tyr Asn Ala Thr Tyr Thr Lys Leu Leu Ser Ala Ser Gln Ala Phe
165                 170                 175                 180

gct gct cag cgg ggc ctt gca gta tta aaa cac gta ctg aca cca aga 632
Ala Ala Gln Arg Gly Leu Ala Val Leu Lys His Val Leu Thr Pro Arg
                185                 190                 195

ata aag gcc act cac gtt gct ttt gat tct atg aag agt cat tta gat 680
Ile Lys Ala Thr His Val Ala Phe Asp Ser Met Lys Ser His Leu Asp
            200                 205                 210

gca att tat gat gtc aca gtg gtt tat gaa ggg aat gag aaa ggt tca 728
Ala Ile Tyr Asp Val Thr Val Val Tyr Glu Gly Asn Glu Lys Gly Ser
        215                 220                 225

gga aaa tac tca aat cca cca tcc atg act gag ttt ctc tgc aaa cag 776
Gly Lys Tyr Ser Asn Pro Pro Ser Met Thr Glu Phe Leu Cys Lys Gln
    230                 235                 240

tgc cca aaa ctt cat att cac ttt gat cgt ata gac aga aat gaa gtt 824
Cys Pro Lys Leu His Ile His Phe Asp Arg Ile Asp Arg Asn Glu Val
245                 250                 255                 260

cca gag gaa caa gaa cac atg aaa aag tgg ctt cat gag cgc ttt gag 872
Pro Glu Glu Gln Glu His Met Lys Lys Trp Leu His Glu Arg Phe Glu
                265                 270                 275

ata aaa gat agg ttg ctc ata gag ttc tat gat tca cca gat cca gaa 920
Ile Lys Asp Arg Leu Leu Ile Glu Phe Tyr Asp Ser Pro Asp Pro Glu
            280                 285                 290

aga aga aac aaa ttt cct ggg aaa agt gtt cat tcc aga cta agt gtg 968
Arg Arg Asn Lys Phe Pro Gly Lys Ser Val His Ser Arg Leu Ser Val
        295                 300                 305

aag aag act tta cct tca gtg ttg atc ttg ggg agt ttg act gcg gtc 1016
Lys Lys Thr Leu Pro Ser Val Leu Ile Leu Gly Ser Leu Thr Ala Val
    310                 315                 320

atg ctg atg acg gag tcc gga agg aaa ctg tac atg ggc acc tgg ttg 1064
Met Leu Met Thr Glu Ser Gly Arg Lys Leu Tyr Met Gly Thr Trp Leu
325                 330                 335                 340

tat gga acc ctc ctt ggc tgc ctg tgg ttt gtt att aaa gca taa 1109
Tyr Gly Thr Leu Leu Gly Cys Leu Trp Phe Val Ile Lys Ala
                345                 350                 355

gcaagtagca ggctgcagtc acagtctctt attgatggct acacattgta tcacattgtt 1169

tcctgaatta aataaggagt tttcttgttg ttgttttttt tgttttgttt tgttctgttt 1229

taagccttga tgattgaaca ctggataaag tcgagtcttg tgaccacagc caacatgcat 1289

ttgatttggg gcaaacacat gtggcttttc aggtgctggg gttgctggag acatggaagc 1349

taagtggagt ttatgctgtt tttttttttt tt 1381

<210> 185 

<211> 47 

<212> DNA 

<213> Homo Sapiens 

<220> 

<221> allele 
<222> 1..47 

<223> polymorphic fragment 4-14-107 
<221> allele 
<222> 24 

<223> polymorphic base A 
<221> primer_bind 
<222> 1..23

<223> potential microsequencing oligo 4-14-107.mis1
<221> primer_bind
<222> 25..47

<223> complement potential microsequencing oligo 4-14-107.mis2
<400> 185

ctaaacaacc accaaatgca tacagcaacc aggcaaatgc ctgatag 47

<210> 186 

<211> 47 

<212> DNA 

<213> Homo Sapiens 

<220> 
<221> allele
<222> 1..47

<223> polymorphic fragment 4-14-317
<221> allele
<222> 24

<223> polymorphic base A
<221> primer_bind
<222> 1..23

<223> potential microsequencing oligo 4-14-317.mis1
<221> primer_bind
<222> 25..47

<223> complement potential microsequencing oligo 4-14-317.mis2
<400> 186 

cataacatgc aaggtgggca agaaaaagag gtgggcacag ctcatga 47 

<210> 187 

<211> 47 

<212> DNA 

<213> Homo Sapiens 

<220> 

<221> allele 
<222> 1..47

<223> polymorphic fragment 4-14-35 
<221> allele 
<222> 24 

<223> polymorphic base C 
<221> primer_bind 
<222> 1..23 

<223> potential microsequencing oligo 4-14-35.mis1
<221> primer_bind
<222> 25..47

<223> complement potential microsequencing oligo 4-14-35.mis2
<400> 187 

atccaacaca gaaaccgcta aaaccaggca gaagctgtct gcagaga 47 

<210> 188 

<211> 47 

<212> DNA 

<213> Homo Sapiens 

<220> 

<221> allele 
<222> 1..47

<223> polymorphic fragment 4-20-149 
<221> allele 
<222> 24 

<223> polymorphic base C 
<221> primer_bind 
<222> 1..23

<223> potential microsequencing oligo 4-20-149.mis1
<221> primer_bind
<222> 25..47

<223> complement potential microsequencing oligo 4-20-149.mis2
<400> 188 

tttttgctgt gtcttcaaag tgactcttgg tttattgcct gctaagg 47 

<210> 189 

<211> 47 

<212> DNA 

<213> Homo Sapiens 

<220>

<221> allele 

<222> 1..47 

<223> polymorphic fragment 4-20-77 
<221> allele 
<222> 24

<223> polymorphic base A
<221> primer_bind
<222> 1..23

<223> potential microsequencing oligo 4-20-77.mis1
<221> primer_bind
<222> 25..47

<223> complement potential microsequencing oligo 4-20-77.mis2
<400> 189

tgcaacatga agattctgaa gggactttgt tgtctgagaa cacatct 47

<210> 190 

<211> 47 

<212> DNA 

<213> Homo Sapiens 

<220> 

<221> allele 
<222> 1..47

<223> polymorphic fragment 4-22-174 
<221> allele 
<222> 24 

<223> polymorphic base A 
<221> primer_bind 
<222> 1..23

<223> potential microsequencing oligo 4-22-174.mis1
<221> primer_bind
<222> 25..47

<223> complement potential microsequencing oligo 4-22-174.mis2
<400> 190

ggattgtgca gaagttgcct ttcatgttca aaaatgttaa tttgttt 47

<210> 191 

<211> 47 

<212> DNA 

<213> Homo Sapiens 

<220> 

<221> allele 
<222> 1..47

<223> polymorphic fragment 4-22-176 
<221> allele 
<222> 24 

<223> polymorphic base A 
<221> primer_bind 
<222> 1..23

<223> potential microsequencing oligo 4-22-176.mis1
<221> primer_bind
<222> 25..47

<223> complement potential microsequencing oligo 4-22-176.mis2
<400> 191

attgtgcaga agttgccttt catattcaaa aatgttaatt tgtttgt 47

<210> 192 

<211> 47 

<212> DNA 

<213> Homo Sapiens

<220> 

<221> allele 
<222> 1..47

<223> polymorphic fragment 4-26-60
<221> allele 
<222> 24 

<223> polymorphic base A 
<221> primer_bind 
<222> 1..23

<223> potential microsequencing oligo 4-26-60.mis1
<221> primer_bind
<222> 25..47

<223> complement potential microsequencing oligo 4-26-60.mis2
<400> 192 

gatgggaaag tgcatcttaa gacagttagc aggccaagga gcgactt 47 
<210> 193 
<211> 47 
<212> DNA 

<213> Homo Sapiens 
<220> 

<221> allele 
<222> 1..47

<223> polymorphic fragment 4-26-72 
<221> allele 
<222> 24 

<223> polymorphic base A 
<221> primer_bind 
<222> 1..23

<223> potential microsequencing oligo 4-26-72.mis1
<221> primer_bind
<222> 25..47

<223> complement potential microsequencing oligo 4-26-72.mis2
<400> 193 

catcttaaga cagttagcag gccaaggagc gactttaaag ggtgagc 47 
<210> 194 
<211> 47 
<212> DNA 

<213> Homo Sapiens 
<220> 

<221> allele 
<222> 1..47

<223> polymorphic fragment 4-3-130 
<221> allele 
<222> 24 

<223> polymorphic base A 
<221> primer_bind 
<222> 1..23 

<223> potential microsequencing oligo 4-3-130.mis1
<221> primer_bind
<222> 25..47

<223> complement potential microsequencing oligo 4-3-130.mis2
<400> 194 

tattgggcct aaaacagtat tctataaagc ttaaattggt attaact 47 
<210> 195 
<211> 47 
<212> DNA 

<213> Homo Sapiens 
<220> 

<221> allele 
<222> 1..47 

<223> polymorphic fragment 4-38-63 
<221> allele 
<222> 24 

<223> polymorphic base A 
<221> primer_bind 
<222> 1..23

<223> potential microsequencing oligo 4-38-63.mis1
<221> primer_bind
<222> 25..47

<223> complement potential microsequencing oligo 4-38-63.mis2
<400> 195

tataagttat aagaaaatca ggcagaggct aaactttttt tttgttt 47 

<210> 196 

<211> 47 

<212> DNA 

<213> Homo Sapiens 

<220> 

<221> allele 
<222> 1..47 

<223> polymorphic fragment 4-38-83 
<221> allele 
<222> 24 

<223> polymorphic base G 
<221> primer_bind 
<222> 1..23

<223> potential microsequencing oligo 4-38-83.mis1
<221> primer_bind
<222> 25..47

<223> complement potential microsequencing oligo 4-38-83.mis2
<400> 196 

ggcagaggct aaactttttt tttgtttggc aatgctgttg agaatat 47 

<210> 197 

<211> 47 

<212> DNA 

<213> Homo Sapiens 

<220> 

<221> allele 
<222> 1..47 

<223> polymorphic fragment 4-4-152 
<221> allele 
<222> 24 

<223> polymorphic base C 
<221> primer_bind 
<222> 1..23

<223> potential microsequencing oligo 4-4-152.mis1
<221> primer_bind
<222> 25..47

<223> complement potential microsequencing oligo 4-4-152.mis2
<400> 197 

tactttccca ttgttcctga cttcgttatc ctatatataa acagaaa 47 

<210> 198 

<211> 47 

<212> DNA 

<213> Homo Sapiens 

<220> 

<221> allele 
<222> 1..47

<223> polymorphic fragment 4-4-187 
<221> allele 
<222> 24 

<223> polymorphic base A 
<221> primer_bind 
<222> 1..23 

<223> potential microsequencing oligo 4-4-187.mis1
<221> primer_bind
<222> 25..47

<223> complement potential microsequencing oligo 4-4-187.mis2
<400> 198 

tataaacaga aacatggatg agtaaaaaaa aaaaaaaaaa aaaaaaa 47 
<210> 199 
<211> 47 
<212> DNA

<213> Homo Sapiens 

<220> 

<221> allele 
<222> 1..47 

<223> polymorphic fragment 4-4-288 
<221> allele 
<222> 24 

<223> polymorphic base G 
<221> primer_bind 
<222> 1..23

<223> potential microsequencing oligo 4-4-288.mis1
<221> primer_bind
<222> 25..47

<223> complement potential microsequencing oligo 4-4-288.mis2
<400> 199 

ctgtcatcaa ctaattttca caagtaccta tgttttgatt tcatgta 47 

<210> 200 

<211> 47 

<212> DNA 

<213> Homo Sapiens 

<220> 

<221> allele 
<222> 1..47

<223> polymorphic fragment 4-42-304 
<221> allele 
<222> 24 

<223> polymorphic base C 
<221> primer_bind 
<222> 1..23 

<223> potential microsequencing oligo 4-42-304.mis1
<221> primer_bind
<222> 25..47

<223> complement potential microsequencing oligo 4-42-304.mis2
<400> 200 

attatttaaa actatttatg taaccttatt ttcaggggtt tttaatt 47 

<210> 201 

<211> 47 

<212> DNA 

<213> Homo Sapiens 

<220> 

<221> allele 
<222> 1..47

<223> polymorphic fragment 4-42-401 
<221> allele 
<222> 24 

<223> polymorphic base A 
<221> primer_bind 
<222> 1..23

<223> potential microsequencing oligo 4-42-401.mis1
<221> primer_bind
<222> 25..47

<223> complement potential microsequencing oligo 4-42-401.mis2
<400> 201 

taagaaagaa ttctgtgttc tggacaaagt ttaaacccac agagcca 47 

<210> 202 

<211> 47 

<212> DNA 

<213> Homo Sapiens 

<220> 

<221> allele 
<222> 1..47

<223> polymorphic fragment 4-43-328 
<221> allele 
<222> 24 

<223> polymorphic base C 
<221> primer_bind 
<222> 1..23

<223> potential microsequencing oligo 4-43-328.mis1
<221> primer_bind
<222> 25..47

<223> complement potential microsequencing oligo 4-43-328.mis2
<400> 202

agaattctgt gttctggcca aagcttaaac ccacagagcc agtttaa 47

<210> 203 

<211> 47 

<212> DNA 

<213> Homo Sapiens 

<220> 

<221> allele 
<222> 1..47

<223> polymorphic fragment 4-43-70 
<221> allele 
<222> 24 

<223> polymorphic base G 
<221> primer_bind 
<222> 1..23

<223> potential microsequencing oligo 4-43-70.mis1
<221> primer_bind
<222> 25..47

<223> complement potential microsequencing oligo 4-43-70.mis2
<400> 203

atcgcctcca ttattctcaa aaagaccatg ggacacaaca caagaag 47

<210> 204 

<211> 47 

<212> DNA 

<213> Homo Sapiens 

<220> 

<221> allele 
<222> 1..47

<223> polymorphic fragment 4-50-209 
<221> allele 
<222> 24 

<223> polymorphic base C 
<221> primer_bind 
<222> 1..23 

<223> potential microsequencing oligo 4-50-209.mis1
<221> primer_bind
<222> 25..47

<223> complement potential microsequencing oligo 4-50-209.mis2
<400> 204

atatagagtg tgcatccctg acactgaaac tgaaggcttt atggttt 47

<210> 205 

<211> 47 

<212> DNA 

<213> Homo Sapiens 

<220> 

<221> allele 
<222> 1..47 

<223> polymorphic fragment 4-50-293 
<221> allele 
<222> 24 
<223> polymorphic base G
<221> primer_bind
<222> 1..23

<223> potential microsequencing oligo 4-50-293.mis1
<221> primer_bind
<222> 25..47

<223> complement potential microsequencing oligo 4-50-293.mis2
<400> 205 

cctgagtccc agggggctga caggggacag tttaaaacat tgatgaa 47 

<210> 206 

<211> 47 

<212> DNA 

<213> Homo Sapiens 

<220> 

<221> allele 
<222> 1..47

<223> polymorphic fragment 4-50-323 
<221> allele 
<222> 24 

<223> polymorphic base C 
<221> primer_bind 
<222> 1..23

<223> potential microsequencing oligo 4-50-323.mis1
<221> primer_bind
<222> 25..47

<223> complement potential microsequencing oligo 4-50-323.mis2
<400> 206 

tttaaaacat tgatgaatct ttactactac aaaagggttc gatttag 47 

<210> 207 

<211> 47 

<212> DNA 

<213> Homo Sapiens 

<220> 

<221> allele 
<222> 1..47

<223> polymorphic fragment 4-50-329 
<221> allele 
<222> 24 

<223> polymorphic base C 
<221> primer_bind 
<222> 1..23 

<223> potential microsequencing oligo 4-50-329.mis1
<221> primer_bind
<222> 25..47

<223> complement potential microsequencing oligo 4-50-329.mis2
<400> 207 

acattgatga atctttatta ctacaaaagg gttcgattta ggctagc 47 

<210> 208 

<211> 47 

<212> DNA 

<213> Homo Sapiens 

<220> 

<221> allele 
<222> 1..47

<223> polymorphic fragment 4-50-330 
<221> allele 
<222> 24 

<223> polymorphic base A 
<221> primer_bind 
<222> 1..23

<223> potential microsequencing oligo 4-50-330.mis1
<221> primer_bind
<222> 25..47

<223> complement potential microsequencing oligo 4-50-330.mis2
<400> 208 

cattgatgaa tctttattac tacaaaaggg ttcgatttag gctagcc 47 

<210> 209 

<211> 47 

<212> DNA 

<213> Homo Sapiens 

<220> 

<221> allele 
<222> 1..47

<223> polymorphic fragment 4-52-163 
<221> allele 
<222> 24 

<223> polymorphic base A 
<221> primer_bind 
<222> 1..23

<223> potential microsequencing oligo 4-52-163.mis1
<221> primer_bind
<222> 25..47

<223> complement potential microsequencing oligo 4-52-163.mis2
<400> 209 

gaacaggata ttcttaacta ccaaagaatt ttacacatct attgttt 47 

<210> 210 

<211> 47 

<212> DNA 

<213> Homo Sapiens 

<220> 

<221> allele 
<222> 1..47 

<223> polymorphic fragment 4-52-88 
<221> allele 
<222> 24 

<223> polymorphic base C 
<221> primer_bind 
<222> 1..23

<223> potential microsequencing oligo 4-52-88.mis1
<221> primer_bind
<222> 25..47

<223> complement potential microsequencing oligo 4-52-88.mis2
<400> 210 

tccatgtcat tattattcaa aagcttaaaa aatacacaag gtgaaaa 47 

<210> 211 

<211> 47 

<212> DNA 

<213> Homo Sapiens 

<220> 

<221> allele 
<222> 1..47 

<223> polymorphic fragment 4-53-258 
<221> allele 
<222> 24 

<223> polymorphic base A 
<221> primer_bind
<222> 1..23

<223> potential microsequencing oligo 4-53-258.mis1
<221> primer_bind
<222> 25..47

<223> complement potential microsequencing oligo 4-53-258.mis2
<400> 211

gagaaatcat gcagagagaa tgcattctca ctcaaatttt aacctaa 47
<210> 212 
<211> 47 
<212> DNA 

<213> Homo Sapiens 
<220> 

<221> allele 
<222> 1..47 

<223> polymorphic fragment 4-54-283 
<221> allele 
<222> 24 

<223> polymorphic base A 
<221> primer_bind 
<222> 1..23 

<223> potential microsequencing oligo 4-54-283.mis1
<221> primer_bind
<222> 25..47

<223> complement potential microsequencing oligo 4-54-283.mis2
<400> 212 

aagtagtttt tcacactttc tctatgatac aatcgatggc ttaatct 47 
<210> 213 
<211> 47 
<212> DNA 

<213> Homo Sapiens 
<220> 

<221> allele 
<222> 1..47 

<223> polymorphic fragment 4-54-388 
<221> allele 
<222> 24 

<223> polymorphic base A 
<221> primer_bind 
<222> 1..23 

<223> potential microsequencing oligo 4-54-388.mis1
<221> primer_bind
<222> 25..47

<223> complement potential microsequencing oligo 4-54-388.mis2
<400> 213 

ctctctatcg tatacatctt tacacacgct gcagcgccaa gactcca 47 

<210> 214 

<211> 47 

<212> DNA 

<213> Homo Sapiens 

<220> 

<221> allele 
<222> 1..47 

<223> polymorphic fragment 4-55-70 
<221> allele 
<222> 24 

<223> polymorphic base A 
<221> primer_bind 
<222> 1..23 

<223> potential microsequencing oligo 4-55-70.mis1
<221> primer_bind
<222> 25..47

<223> complement potential microsequencing oligo 4-55-70.mis2
<400> 214 

tattaagaac ctaggtttta aaaaactctc tatcgtatac atcttta 47 
<210> 215 
<211> 47 
<212> DNA 
<213> Homo Sapiens
<220>

<221> allele
<222> 1..47

<223> polymorphic fragment 4-55-95 
<221> allele 
<222> 24 

<223> polymorphic base A 
<221> primer_bind 
<222> 1..23

<223> potential microsequencing oligo 4-55-95.mis1
<221> primer_bind
<222> 25..47

<223> complement potential microsequencing oligo 4-55-95.mis2
<400> 215

ctctctatcg tatacatctt tacacacgct gcagcgccaa gactcca 47

<210> 216 
<211> 47 
<212> DNA 

<213> Homo Sapiens 
<220> 

<221> allele 
<222> 1..47 

<223> polymorphic fragment 4-56-159 
<221> allele 
<222> 24 

<223> polymorphic base C 
<221> primer_bind 
<222> 1..23 

<223> potential microsequencing oligo 4-56-159.mis1
<221> primer_bind
<222> 25..47

<223> complement potential microsequencing oligo 4-56-159.mis2
<400> 216 

aagttttcct tctcttctgt agacgtctcc atgttacagt caactat 47 
<210> 217 
<211> 47 
<212> DNA 

<213> Homo Sapiens 
<220> 

<221> allele 
<222> 1..47 

<223> polymorphic fragment 4-56-213 
<221> allele 
<222> 24 

<223> polymorphic base A 
<221> primer_bind 
<222> 1..23 

<223> potential microsequencing oligo 4-56-213.mis1
<221> primer_bind
<222> 25..47

<223> complement potential microsequencing oligo 4-56-213.mis2
<400> 217 

atggctcatg ttcactctgg ttcaccttca gaggagtttg atatttt 47 
<210> 218 
<211> 47 
<212> DNA 

<213> Homo Sapiens 
<220> 

<221> allele 
<222> 1..47

<223> polymorphic fragment 4-58-289
<221> allele 
<222> 24 

<223> polymorphic base G 
<221> primer_bind
<222> 1..23

<223> potential microsequencing oligo 4-58-289.mis1
<221> primer_bind
<222> 25..47

<223> complement potential microsequencing oligo 4-58-289.mis2
<400> 218

catacctgca gcctgctttt ggtgaggggt gactacttta cctgcaa 47

<210> 219 

<211> 47 

<212> DNA 

<213> Homo Sapiens 

<220> 

<221> allele 
<222> 1..47

<223> polymorphic fragment 4-58-318 
<221> allele 
<222> 24 

<223> polymorphic base A 
<221> primer_bind 
<222> 1..23

<223> potential microsequencing oligo 4-58-318.mis1
<221> primer_bind
<222> 25..47

<223> complement potential microsequencing oligo 4-58-318.mis2
<400> 219

tgactacttt acctgcaata tttatttgca agtttatttc ttccttt 47

<210> 220 

<211> 47 

<212> DNA 

<213> Homo Sapiens 

<220> 

<221> allele 
<222> 1..47

<223> polymorphic fragment 4-60-266 
<221> allele 
<222> 24 

<223> polymorphic base G 
<221> primer_bind 
<222> 1..23

<223> potential microsequencing oligo 4-60-266.mis1
<221> primer_bind
<222> 25..47

<223> complement potential microsequencing oligo 4-60-266.mis2
<400> 220

aacaggacca agacactgca ttagataaag tttcagtatt tcttagc 47

<210> 221 

<211> 47 

<212> DNA 

<213> Homo Sapiens 

<220> 

<221> allele 
<222> 1..47

<223> polymorphic fragment 4-60-293 
<221> allele 
<222> 24 

<223> polymorphic base C 
<221> primer_bind
<222> 1..23

<223> potential microsequencing oligo 4-60-293.mis1
<221> primer_bind
<222> 25..47

<223> complement potential microsequencing oligo 4-60-293.mis2
<400> 221 

aagtttcagt atttcttagc agacgaagcc agcaggaagt cctccta 47 

<210> 222 

<211> 47 

<212> DNA 

<213> Homo Sapiens 

<220> 

<221> allele 
<222> 1..47 

<223> polymorphic fragment 4-84-241 
<221> allele 
<222> 24 

<223> polymorphic base G 
<221> primer_bind
<222> 1..23

<223> potential microsequencing oligo 4-84-241.mis1
<221> primer_bind
<222> 25..47

<223> complement potential microsequencing oligo 4-84-241.mis2
<400> 222 

gaaaaaaaaa tagtgactgc cacggtgaat aattcagttc ttcagaa 47 

<210> 223 

<211> 47 

<212> DNA 

<213> Homo Sapiens 

<220> 

<221> allele 
<222> 1..47 

<223> polymorphic fragment 4-84-262 
<221> allele 
<222> 24 

<223> polymorphic base A 
<221> primer_bind 
<222> 1..23 

<223> potential microsequencing oligo 4-84-262.mis1
<221> primer_bind
<222> 25..47

<223> complement potential microsequencing oligo 4-84-262.mis2
<400> 223 

acggtgaata attcagttct tcaaaagcag caacatgatc tcatgga 47 

<210> 224 

<211> 47 

<212> DNA 

<213> Homo Sapiens 

<220> 

<221> allele 
<222> 1..47

<223> polymorphic fragment 4-86-206 
<221> allele 
<222> 24 

<223> polymorphic base A 
<221> primer_bind 
<222> 1..23 

<223> potential microsequencing oligo 4-86-206.mis1
<221> primer_bind
<222> 25..47

<223> complement potential microsequencing oligo 4-86-206.mis2

<400> 224 

gtattcaaat caggacacac cacaaatggc atctacacgt taacatt 47 

<210> 225 

<211> 47 

<212> DNA 

<213> Homo Sapiens 

<220> 

<221> allele 
<222> 1..47

<223> polymorphic fragment 4-86-309 
<221> allele 
<222> 24 

<223> polymorphic base A 
<221> primer_bind
<222> 1..23

<223> potential microsequencing oligo 4-86-309.mis1
<221> primer_bind
<222> 25..47

<223> complement potential microsequencing oligo 4-86-309.mis2
<400> 225 

tggctctagg caggccactt tagagagtga ggaaccagag agcagaa 47 
<210> 226 
<211> 47 
<212> DNA 

<213> Homo Sapiens 
<220> 

<221> allele 
<222> 1..47

<223> polymorphic fragment 4-88-349 
<221> allele 
<222> 24 

<223> polymorphic base G 
<221> primer_bind 
<222> 1..23 

<223> potential microsequencing oligo 4-88-349.mis1
<221> primer_bind
<222> 25..47

<223> complement potential microsequencing oligo 4-88-349.mis2
<400> 226 

gaaactaaaa gacaatattc agtgtgagat tttccaagtt ctttatg 47 
<210> 227 
<211> 47 
<212> DNA 

<213> Homo Sapiens 
<220> 

<221> allele 
<222> 1..47

<223> polymorphic fragment 4-89-87 
<221> allele 
<222> 24 

<223> polymorphic base C 
<221> primer_bind 
<222> 1..23 

<223> potential microsequencing oligo 4-89-87.mis1
<221> primer_bind
<222> 25..47

<223> complement potential microsequencing oligo 4-89-87.mis2
<400> 227 

ttcttccctg aacgctggtt tcacatagtt tttgtgttga gaataga 47 
<210> 228
<211> 47 
<212> DNA 

<213> Homo Sapiens 
<220> 

<221> allele 
<222> 1..47

<223> polymorphic fragment 99-123-184 
<221> allele 
<222> 24 

<223> polymorphic base G 
<221> primer_bind 
<222> 1..23

<223> potential microsequencing oligo 99-123-184.mis1
<221> primer_bind
<222> 25..47

<223> complement potential microsequencing oligo 99-123-184.mis2
<400> 228 

ccagcccaga acattcacca gctgggccaa gagttctgct gggtttt 47 

<210> 229 

<211> 47 

<212> DNA 

<213> Homo Sapiens 

<220> 

<221> allele 
<222> 1..47

<223> polymorphic fragment 99-128-202 
<221> allele 
<222> 24 

<223> polymorphic base A 
<221> primer_bind 
<222> 1..23

<223> potential microsequencing oligo 99-128-202.mis1
<221> primer_bind
<222> 25..47

<223> complement potential microsequencing oligo 99-128-202.mis2
<400> 229 

aatgtctgtt tcttagagaa ctgaaacaca cacacataca tacacac 47 
<210> 230 
<211> 47 
<212> DNA 

<213> Homo Sapiens 
<220> 

<221> allele
<222> 1..47
<223> polymorphic fragment 99-128-275
<221> allele
<222> 24
<223> polymorphic base A
<221> primer_bind
<222> 1..23
<223> potential microsequencing oligo 99-128-275.mis1
<221> primer_bind
<222> 25..47
<223> complement potential microsequencing oligo 99-128-275.mis2
<400> 230 

acacccctac ctcacatgtg tagacaaatg tatgcatata tgtctct 47 

<210> 231 

<211> 47 

<212> DNA 

<213> Homo Sapiens 
<220>
<221> allele
<222> 1..47
<223> polymorphic fragment 99-128-313
<221> allele
<222> 24
<223> polymorphic base A
<221> primer_bind
<222> 1..23
<223> potential microsequencing oligo 99-128-313.mis1
<221> primer_bind
<222> 25..47
<223> complement potential microsequencing oligo 99-128-313.mis2
<400> 231 

tatgtctcta gacagatata cataagattc tatttggcat agaaaaa 47 
<210> 232 
<211> 47 
<212> DNA 

<213> Homo Sapiens 
<220> 

<221> allele
<222> 1..47
<223> polymorphic fragment 99-128-60
<221> allele
<222> 24
<223> polymorphic base C
<221> primer_bind
<222> 1..23
<223> potential microsequencing oligo 99-128-60.mis1
<221> primer_bind
<222> 25..47
<223> complement potential microsequencing oligo 99-128-60.mis2
<400> 232 

gcactgtgac ccaggcgcta ggtccctctt acagtgacac tccgaca 47 
<210> 233 
<211> 47 
<212> DNA 

<213> Homo Sapiens 
<220> 

<221> allele
<222> 1..47
<223> polymorphic fragment 99-12907-295
<221> allele
<222> 24
<223> polymorphic base A
<221> primer_bind
<222> 1..23
<223> potential microsequencing oligo 99-12907-295.mis1
<221> primer_bind
<222> 25..47
<223> complement potential microsequencing oligo 99-12907-295.mis2
<400> 233 

gctatatggc attatatctc cacagggcag acctgatgta caagatg 47 
<210> 234 
<211> 47 
<212> DNA 

<213> Homo Sapiens 
<220> 

<221> allele
<222> 1..47
<223> polymorphic fragment 99-130-58
<221> allele
<222> 24
<223> polymorphic base C
<221> primer_bind
<222> 1..23
<223> potential microsequencing oligo 99-130-58.mis1
<221> primer_bind
<222> 25..47
<223> complement potential microsequencing oligo 99-130-58.mis2
<400> 234 

aaagcaaaag agcttcaaaa atacttcagg agtgtgcata tggcgag 47 

<210> 235 

<211> 47 

<212> DNA 

<213> Homo Sapiens 

<220> 

<221> allele
<222> 1..47
<223> polymorphic fragment 99-134-362
<221> allele
<222> 24
<223> polymorphic base G
<221> primer_bind
<222> 1..23
<223> potential microsequencing oligo 99-134-362.mis1
<221> primer_bind
<222> 25..47
<223> complement potential microsequencing oligo 99-134-362.mis2
<400> 235 

caaaacactc atgttagtta gatgattatt cctattacaa agataag 47 

<210> 236 

<211> 47 

<212> DNA 

<213> Homo Sapiens 

<220> 

<221> allele
<222> 1..47
<223> polymorphic fragment 99-140-130
<221> allele
<222> 24
<223> polymorphic base C
<221> primer_bind
<222> 1..23
<223> potential microsequencing oligo 99-140-130.mis1
<221> primer_bind
<222> 25..47
<223> complement potential microsequencing oligo 99-140-130.mis2
<400> 236 

tgttcaaaag cagctacaga ccacatgtaa acaattgagc atggctg 47 
<210> 237 
<211> 47 
<212> DNA 

<213> Homo Sapiens 
<220> 

<221> allele
<222> 1..47
<223> polymorphic fragment 99-1462-238
<221> allele
<222> 24
<223> polymorphic base G
<221> primer_bind
<222> 1..23
<223> potential microsequencing oligo 99-1462-238.mis1
<221> primer_bind
<222> 25..47
<223> complement potential microsequencing oligo 99-1462-238.mis2
<400> 237 

ccctttcaag gttagtaact catgtgctgt gtttctgctt cagaagg 47 

<210> 238 

<211> 47 

<212> DNA 

<213> Homo Sapiens 

<220> 

<221> allele
<222> 1..47
<223> polymorphic fragment 99-147-181
<221> allele
<222> 24
<223> polymorphic base A
<221> primer_bind
<222> 1..23
<223> potential microsequencing oligo 99-147-181.mis1
<221> primer_bind
<222> 25..47
<223> complement potential microsequencing oligo 99-147-181.mis2
<400> 238 

gtgtcatgaa aaagagcatg ataaaaagaa aaacttaaat ctttata 47 
<210> 239 
<211> 47 
<212> DNA 

<213> Homo Sapiens 
<220> 

<221> allele
<222> 1..47
<223> polymorphic fragment 99-1474-156
<221> allele
<222> 24
<223> polymorphic base G
<221> primer_bind
<222> 1..23
<223> potential microsequencing oligo 99-1474-156.mis1
<221> primer_bind
<222> 25..47
<223> complement potential microsequencing oligo 99-1474-156.mis2
<400> 239 

cttgtactca taagttaaat attgataaca agaagaaata tggactt 47 
<210> 240 
<211> 47 
<212> DNA 

<213> Homo Sapiens 
<220> 

<221> allele
<222> 1..47
<223> polymorphic fragment 99-1474-359
<221> allele
<222> 24
<223> polymorphic base A
<221> primer_bind
<222> 1..23
<223> potential microsequencing oligo 99-1474-359.mis1
<221> primer_bind
<222> 25..47






<223> complement potential microsequencing oligo 99-1474-359.mis2
<400> 240 

aaaaaaaatc aaattattgt accaaattcc ctaatatcag atgtgta 47 

<210> 241 

<211> 47 

<212> DNA 

<213> Homo Sapiens 

<220> 

<221> allele
<222> 1..47
<223> polymorphic fragment 99-1479-158
<221> allele
<222> 24
<223> polymorphic base C
<221> primer_bind
<222> 1..23
<223> potential microsequencing oligo 99-1479-158.mis1
<221> primer_bind
<222> 25..47
<223> complement potential microsequencing oligo 99-1479-158.mis2
<400> 241 

tttaaaaatc cacttgtaat cgccgctaat tggagtgtat attcagg 47 
<210> 242 
<211> 47 
<212> DNA 

<213> Homo Sapiens 
<220> 

<221> allele
<222> 1..47
<223> polymorphic fragment 99-1479-379
<221> allele
<222> 24
<223> polymorphic base A
<221> primer_bind
<222> 1..23
<223> potential microsequencing oligo 99-1479-379.mis1
<221> primer_bind
<222> 25..47
<223> complement potential microsequencing oligo 99-1479-379.mis2
<400> 242 

gtagagctgt gtactgaggt cagagaagca gctcatggta cagcctt 47 
<210> 243 
<211> 47 
<212> DNA 

<213> Homo Sapiens 
<220> 

<221> allele
<222> 1..47
<223> polymorphic fragment 99-148-129
<221> allele
<222> 24
<223> polymorphic base A
<221> primer_bind
<222> 1..23
<223> potential microsequencing oligo 99-148-129.mis1
<221> primer_bind
<222> 25..47
<223> complement potential microsequencing oligo 99-148-129.mis2
<400> 243 

ttcatatcta tacaaataat tttaaattta atacataggg ctgcaaa 47 
<210> 244 
<211> 47

<212> DNA 

<213> Homo Sapiens 

<220> 

<221> allele
<222> 1..47
<223> polymorphic fragment 99-148-132
<221> allele
<222> 24
<223> polymorphic base C
<221> primer_bind
<222> 1..23
<223> potential microsequencing oligo 99-148-132.mis1
<221> primer_bind
<222> 25..47
<223> complement potential microsequencing oligo 99-148-132.mis2
<400> 244 

atatctatac aaataatttt gaacttaata catagggctg caaaaca 47
<210> 245 
<211> 47 
<212> DNA 

<213> Homo Sapiens 
<220> 

<221> allele
<222> 1..47
<223> polymorphic fragment 99-148-139
<221> allele
<222> 24
<223> polymorphic base C
<221> primer_bind
<222> 1..23
<223> potential microsequencing oligo 99-148-139.mis1
<221> primer_bind
<222> 25..47
<223> complement potential microsequencing oligo 99-148-139.mis2
<400> 245 

tacaaataat tttgaattta atacataggg ctgcaaaaca aggttga 47
<210> 246 
<211> 47 
<212> DNA 

<213> Homo Sapiens 
<220> 

<221> allele
<222> 1..47
<223> polymorphic fragment 99-148-140
<221> allele
<222> 24
<223> polymorphic base A
<221> primer_bind
<222> 1..23
<223> potential microsequencing oligo 99-148-140.mis1
<221> primer_bind
<222> 25..47
<223> complement potential microsequencing oligo 99-148-140.mis2
<400> 246 

acaaataatt ttgaatttaa tacatagggc tgcaaaacaa ggttgat 47
<210> 247 
<211> 47 
<212> DNA 

<213> Homo Sapiens 
<220> 
<221> allele
<222> 1..47
<223> polymorphic fragment 99-148-182
<221> allele
<222> 24
<223> polymorphic base A
<221> primer_bind
<222> 1..23
<223> potential microsequencing oligo 99-148-182.mis1
<221> primer_bind
<222> 25..47
<223> complement potential microsequencing oligo 99-148-182.mis2
<400> 247 

ttgatgttga tatgggcaac tgtatgttgg atggtcccaa agcattc 47
<210> 248 
<211> 47 
<212> DNA 

<213> Homo Sapiens 
<220> 

<221> allele
<222> 1..47
<223> polymorphic fragment 99-148-366
<221> allele
<222> 24
<223> polymorphic base G
<221> primer_bind
<222> 1..23
<223> potential microsequencing oligo 99-148-366.mis1
<221> primer_bind
<222> 25..47
<223> complement potential microsequencing oligo 99-148-366.mis2
<400> 248 

tccttgtcaa aggtctctcc ctggtgctca cggctgccgc ctcaaag 47

<210> 249 

<211> 47 

<212> DNA 

<213> Homo Sapiens 

<220> 

<221> allele
<222> 1..47
<223> polymorphic fragment 99-148-76
<221> allele
<222> 24
<223> polymorphic base C
<221> primer_bind
<222> 1..23
<223> potential microsequencing oligo 99-148-76.mis1
<221> primer_bind
<222> 25..47
<223> complement potential microsequencing oligo 99-148-76.mis2
<400> 249 

tgatagaatg ccttcctgaa ttactactct tgatggcttc ataaaac 47

<210> 250 

<211> 47 

<212> DNA 

<213> Homo Sapiens 

<220> 

<221> allele
<222> 1..47
<223> polymorphic fragment 99-1480-290
<221> allele
<222> 24
<223> polymorphic base G
<221> primer_bind
<222> 1..23
<223> potential microsequencing oligo 99-1480-290.mis1
<221> primer_bind
<222> 25..47
<223> complement potential microsequencing oligo 99-1480-290.mis2



<400> 250
tgcaccatct tcaccacaac cccgggcaac cactgatcct tttactg 47
<210> 251
<211> 47
<212> DNA
<213> Homo Sapiens
<220>
<221> allele
<222> 1..47
<223> polymorphic fragment 99-1481-285
<221> allele
<222> 24
<223> polymorphic base G
<221> primer_bind
<222> 1..23
<223> potential microsequencing oligo 99-1481-285.mis1
<221> primer_bind
<222> 25..47
<223> complement potential microsequencing oligo 99-1481-285.mis2
<400> 251
tcccataacc tgttttgctt ctcgctctaa cctcaagatg gtataaa 47
<210> 252
<211> 47
<212> DNA
<213> Homo Sapiens
<220>
<221> allele
<222> 1..47
<223> polymorphic fragment 99-1484-101
<221> allele
<222> 24
<223> polymorphic base A
<221> primer_bind
<222> 1..23
<223> potential microsequencing oligo 99-1484-101.mis1
<221> primer_bind
<222> 25..47
<223> complement potential microsequencing oligo 99-1484-101.mis2
<400> 252
aaaaagatca aatataagca tgtaactcct ctccttaaaa tctcagt 47
<210> 253
<211> 47
<212> DNA
<213> Homo Sapiens
<220>
<221> allele
<222> 1..47
<223> polymorphic fragment 99-1484-328
<221> allele
<222> 24
<223> polymorphic base G
<221> primer_bind
<222> 1..23
<223> potential microsequencing oligo 99-1484-328.mis1
<221> primer_bind
<222> 25..47
<223> complement potential microsequencing oligo 99-1484-328.mis2
<400> 253 

ggacacgtgg tcatgaggag tttgaaggga ttcagttttc agatccc 47 

<210> 254 

<211> 47 

<212> DNA 

<213> Homo Sapiens 

<220> 

<221> allele
<222> 1..47
<223> polymorphic fragment 99-1485-251
<221> allele
<222> 24
<223> polymorphic base G
<221> primer_bind
<222> 1..23
<223> potential microsequencing oligo 99-1485-251.mis1
<221> primer_bind
<222> 25..47
<223> complement potential microsequencing oligo 99-1485-251.mis2
<400> 254 

gattgccttg atatatgctc ccagagaacc aagaatgtcc ccttttc 47 

<210> 255 

<211> 47 

<212> DNA 

<213> Homo Sapiens

<220> 

<221> allele
<222> 1..47
<223> polymorphic fragment 99-1490-381
<221> allele
<222> 24
<223> polymorphic base C
<221> primer_bind
<222> 1..23
<223> potential microsequencing oligo 99-1490-381.mis1
<221> primer_bind
<222> 25..47
<223> complement potential microsequencing oligo 99-1490-381.mis2
<400> 255 

tgcacagtgg aaataccatg tcacggtacg ctactgtgca tctcttc 47 

<210> 256 

<211> 47 

<212> DNA 

<213> Homo Sapiens 

<220> 

<221> allele
<222> 1..47
<223> polymorphic fragment 99-1493-280
<221> allele
<222> 24
<223> polymorphic base A
<221> primer_bind
<222> 1..23
<223> potential microsequencing oligo 99-1493-280.mis1
<221> primer_bind
<222> 25..47
<223> complement potential microsequencing oligo 99-1493-280.mis2
<400> 256

ggatgacaga gtattgttgg aggaatgggg tttggctgct tgttttt 47 
<210> 257 
<211> 47 
<212> DNA 

<213> Homo Sapiens 
<220> 

<221> allele
<222> 1..47
<223> polymorphic fragment 99-151-94
<221> allele
<222> 24
<223> polymorphic base A
<221> primer_bind
<222> 1..23
<223> potential microsequencing oligo 99-151-94.mis1
<221> primer_bind
<222> 25..47
<223> complement potential microsequencing oligo 99-151-94.mis2
<400> 257 

attgagatca ttgataagga aatattctaa aatttcaaaa tctatat 47 

<210> 258 

<211> 47 

<212> DNA 

<213> Homo Sapiens 

<220> 

<221> allele
<222> 1..47
<223> polymorphic fragment 99-211-291
<221> allele
<222> 24
<223> polymorphic base A
<221> primer_bind
<222> 1..23
<223> potential microsequencing oligo 99-211-291.mis1
<221> primer_bind
<222> 25..47
<223> complement potential microsequencing oligo 99-211-291.mis2
<400> 258 

ctggttatat capactgacc ttcatgtttt caacaggtca atgcctt 47 

<210> 259 

<211> 45 

<212> DNA 

<213> Homo Sapiens 

<220> 

<221> allele
<222> 1..45
<223> polymorphic fragment 99-213-37
<221> allele
<222> 23
<223> polymorphic base T
<221> primer_bind
<222> 1..22
<223> potential microsequencing oligo 99-213-37.mis1
<221> primer_bind
<222> 24..45
<223> complement potential microsequencing oligo 99-213-37.mis2
<400> 259 

gtgcttccgg ctgcaggact gttggaggac tccagtgtct gacag 45 
<210> 260 
<211> 47 
<212> DNA
<213> Homo Sapiens
<220>
<221> allele
<222> 1..47
<223> polymorphic fragment 99-221-442
<221> allele
<222> 24
<223> polymorphic base A
<221> primer_bind
<222> 1..23
<223> potential microsequencing oligo 99-221-442.mis1
<221> primer_bind
<222> 25..47
<223> complement potential microsequencing oligo 99-221-442.mis2
<400> 260 

tgcctttgta gatatgcatg ggaattccat gacctagcca gacgaat 47

<210> 261 

<211> 47 

<212> DNA 

<213> Homo Sapiens 

<220> 

<221> allele
<222> 1..47
<223> polymorphic fragment 99-222-109
<221> allele
<222> 24
<223> polymorphic base C
<221> primer_bind
<222> 1..23
<223> potential microsequencing oligo 99-222-109.mis1
<221> primer_bind
<222> 25..47
<223> complement potential microsequencing oligo 99-222-109.mis2
<400> 261 

caggtgagga gtgctggatt ggccacgata tgaatttctt cagcagt 47

<210> 262 

<211> 47 

<212> DNA 

<213> Homo Sapiens 

<220> 

<221> allele
<222> 1..47
<223> polymorphic fragment 4-14-107, variant version of SEQ ID185
<221> allele
<222> 24
<223> base G ; A in SEQ ID185
<221> primer_bind
<222> 1..23
<223> potential microsequencing oligo 4-14-107.mis1
<221> primer_bind
<222> 25..47
<223> complement potential microsequencing oligo 4-14-107.mis2
<400> 262 

ctaaacaacc accaaatgca tacggcaacc aggcaaatgc ctgatag 47 

<210> 263 

<211> 47 

<212> DNA 

<213> Homo Sapiens 

<220> 

<221> allele
<222> 1..47
<223> polymorphic fragment 4-14-317, variant version of SEQ ID186
<221> allele
<222> 24
<223> base G ; A in SEQ ID186
<221> primer_bind
<222> 1..23
<223> potential microsequencing oligo 4-14-317.mis1
<221> primer_bind
<222> 25..47
<223> complement potential microsequencing oligo 4-14-317.mis2
<400> 263 

cataacatgc aaggtgggca agagaaagag gtgggcacag ctcatga 47 

<210> 264 

<211> 47 

<212> DNA 

<213> Homo Sapiens 

<220> 

<221> allele
<222> 1..47
<223> polymorphic fragment 4-14-35, variant version of SEQ ID187
<221> allele
<222> 24
<223> base T ; C in SEQ ID187
<221> primer_bind
<222> 1..23
<223> potential microsequencing oligo 4-14-35.mis1
<221> primer_bind
<222> 25..47
<223> complement potential microsequencing oligo 4-14-35.mis2
<400> 264 

atccaacaca gaaaccgcta aaatcaggca gaagctgtct gcagaga 47

<210> 265 

<211> 47 

<212> DNA
<213> Homo Sapiens
<220>
<221> allele
<222> 1..47
<223> polymorphic fragment 4-20-149, variant version of SEQ ID188
<221> allele
<222> 24
<223> base T ; C in SEQ ID188
<221> primer_bind
<222> 1..23
<223> potential microsequencing oligo 4-20-149.mis1
<221> primer_bind
<222> 25..47
<223> complement potential microsequencing oligo 4-20-149.mis2
<400> 265 

tttttgctgt gtcttcaaag tgattcttgg tttattgcct gctaagg 47 

<210> 266 

<211> 47 

<212> DNA 

<213> Homo Sapiens
<220>
<221> allele
<222> 1..47
<223> polymorphic fragment 4-20-77, variant version of SEQ ID189
<221> allele
<222> 24
<223> base T ; A in SEQ ID189
<221> primer_bind
<222> 1..23
<223> potential microsequencing oligo 4-20-77.mis1
<221> primer_bind
<222> 25..47
<223> complement potential microsequencing oligo 4-20-77.mis2
<400> 266 

tgcaacatga agattctgaa gggtctttgt tgtctgagaa cacatct 47 

<210> 267 

<211> 47 

<212> DNA 

<213> Homo Sapiens 

<220> 

<221> allele
<222> 1..47
<223> polymorphic fragment 4-22-174, variant version of SEQ ID190
<221> allele
<222> 24
<223> base C ; A in SEQ ID190
<221> primer_bind
<222> 1..23
<223> potential microsequencing oligo 4-22-174.mis1
<221> primer_bind
<222> 25..47
<223> complement potential microsequencing oligo 4-22-174.mis2
<400> 267 

ggattgtgca gaagttgcct ttcctgttca aaaatgttaa tttgttt 47 

<210> 268 

<211> 47 

<212> DNA 

<213> Homo Sapiens 

<220> 

<221> allele
<222> 1..47
<223> polymorphic fragment 4-22-176, variant version of SEQ ID191
<221> allele
<222> 24
<223> base G ; A in SEQ ID191
<221> primer_bind
<222> 1..23
<223> potential microsequencing oligo 4-22-176.mis1
<221> primer_bind
<222> 25..47
<223> complement potential microsequencing oligo 4-22-176.mis2
<400> 268 

attgtgcaga agttgccttt catgttcaaa aatgttaatt tgtttgt 47 

<210> 269 

<211> 47 

<212> DNA 

<213> Homo Sapiens 

<220> 

<221> allele
<222> 1..47
<223> polymorphic fragment 4-26-60, variant version of SEQ ID192
<221> allele
<222> 24
<223> base G ; A in SEQ ID192
<221> primer_bind
<222> 1..23
<223> potential microsequencing oligo 4-26-60.mis1
<221> primer_bind
<222> 25..47
<223> complement potential microsequencing oligo 4-26-60.mis2
<400> 269 

gatgggaaag tgcatcttaa gacggttagc aggccaagga gcgactt 47 

<210> 270 

<211> 47 

<212> DNA 

<213> Homo Sapiens 

<220> 

<221> allele
<222> 1..47
<223> polymorphic fragment 4-26-72, variant version of SEQ ID193
<221> allele
<222> 24
<223> base G ; A in SEQ ID193
<221> primer_bind
<222> 1..23
<223> potential microsequencing oligo 4-26-72.mis1
<221> primer_bind
<222> 25..47
<223> complement potential microsequencing oligo 4-26-72.mis2
<400> 270

catcttaaga cagttagcag gccgaggagc gactttaaag ggtgagc 47 

<210> 271 

<211> 47 

<212> DNA 

<213> Homo Sapiens 

<220> 

<221> allele
<222> 1..47
<223> polymorphic fragment 4-3-130, variant version of SEQ ID194
<221> allele
<222> 24
<223> base G ; A in SEQ ID194
<221> primer_bind
<222> 1..23
<223> potential microsequencing oligo 4-3-130.mis1
<221> primer_bind
<222> 25..47
<223> complement potential microsequencing oligo 4-3-130.mis2
<400> 271

tattgggcct aaaacagtat tctgtaaagc ttaaattggt attaact 47

<210> 272
<211> 47
<212> DNA
<213> Homo Sapiens

<220> 

<221> allele
<222> 1..47
<223> polymorphic fragment 4-38-63, variant version of SEQ ID195
<221> allele
<222> 24
<223> base G ; A in SEQ ID195
<221> primer_bind
<222> 1..23
<223> potential microsequencing oligo 4-38-63.mis1
<221> primer_bind
<222> 25..47
<223> complement potential microsequencing oligo 4-38-63.mis2
<400> 272
tataagttat aagaaaatca ggcggaggct aaactttttt tttgttt 47

<210> 273 

<211> 47 

<212> DNA 

<213> Homo Sapiens 

<220> 

<221> allele
<222> 1..47
<223> polymorphic fragment 4-38-83, variant version of SEQ ID196
<221> allele
<222> 24
<223> base T ; G in SEQ ID196
<221> primer_bind
<222> 1..23
<223> potential microsequencing oligo 4-38-83.mis1
<221> primer_bind
<222> 25..47
<223> complement potential microsequencing oligo 4-38-83.mis2
<400> 273

ggcagaggct aaactttttt tttttttggc aatgctgttg agaatat 47 

<210> 274 

<211> 47 

<212> DNA 

<213> Homo Sapiens 

<220> 

<221> allele
<222> 1..47
<223> polymorphic fragment 4-4-152, variant version of SEQ ID197
<221> allele
<222> 24
<223> base T ; C in SEQ ID197
<221> primer_bind
<222> 1..23
<223> potential microsequencing oligo 4-4-152.mis1
<221> primer_bind
<222> 25..47
<223> complement potential microsequencing oligo 4-4-152.mis2
<400> 274 

tactttccca ttgttcctga ctttgttatc ctatatataa acagaaa 47 

<210> 275 

<211> 47 

<212> DNA 

<213> Homo Sapiens
<220>
<221> allele
<222> 1..47
<223> polymorphic fragment 4-4-187, variant version of SEQ ID198
<221> allele
<222> 24
<223> base T ; A in SEQ ID198
<221> primer_bind
<222> 1..23
<223> potential microsequencing oligo 4-4-187.mis1
<221> primer_bind
<222> 25..47
<223> complement potential microsequencing oligo 4-4-187.mis2
<400> 275 

tataaacaga aacatggatg agttaaaaaa aaaaaaaaaa aaaaaaa 47 
<210> 276 
<211> 47 
<212> DNA 
<213> Homo Sapiens
<220>
<221> allele
<222> 1..47
<223> polymorphic fragment 4-4-288, variant version of SEQ ID199
<221> allele
<222> 24
<223> base C ; G in SEQ ID199
<221> primer_bind
<222> 1..23
<223> potential microsequencing oligo 4-4-288.mis1
<221> primer_bind
<222> 25..47
<223> complement potential microsequencing oligo 4-4-288.mis2
<400> 276 

ctgtcatcaa ctaattttca caactaccta tgttttgatt tcatgta 47 
<210> 277 
<211> 47 
<212> DNA 

<213> Homo Sapiens 
<220> 

<221> allele
<222> 1..47
<223> polymorphic fragment 4-42-304, variant version of SEQ ID200
<221> allele
<222> 24
<223> base T ; C in SEQ ID200
<221> primer_bind
<222> 1..23
<223> potential microsequencing oligo 4-42-304.mis1
<221> primer_bind
<222> 25..47
<223> complement potential microsequencing oligo 4-42-304.mis2
<400> 277

attatttaaa actatttatg taatcttatt ttcaggggtt tttaatt 47
<210> 278 
<211> 47 
<212> DNA 

<213> Homo Sapiens 
<220> 

<221> allele
<222> 1..47
<223> polymorphic fragment 4-42-401, variant version of SEQ ID201
<221> allele
<222> 24
<223> base C ; A in SEQ ID201
<221> primer_bind
<222> 1..23
<223> potential microsequencing oligo 4-42-401.mis1
<221> primer_bind
<222> 25..47
<223> complement potential microsequencing oligo 4-42-401.mis2
<400> 278

taagaaagaa ttctgtgttc tggccaaagt ttaaacccac agagcca 47

<210> 279
<211> 47
<212> DNA
<213> Homo Sapiens
<220>
<221> allele
<222> 1..47
<223> polymorphic fragment 4-43-328, variant version of SEQ ID202
<221> allele
<222> 24
<223> base T ; C in SEQ ID202
<221> primer_bind
<222> 1..23
<223> potential microsequencing oligo 4-43-328.mis1
<221> primer_bind
<222> 25..47
<223> complement potential microsequencing oligo 4-43-328.mis2
<400> 279 

agaattctgt gttctggcca aagtttaaac ccacagagcc agtttaa 47 

<210> 280 

<211> 47 

<212> DNA 

<213> Homo Sapiens 

<220> 

<221> allele
<222> 1..47
<223> polymorphic fragment 4-43-70, variant version of SEQ ID203
<221> allele
<222> 24
<223> base C ; G in SEQ ID203
<221> primer_bind
<222> 1..23
<223> potential microsequencing oligo 4-43-70.mis1
<221> primer_bind
<222> 25..47
<223> complement potential microsequencing oligo 4-43-70.mis2
<400> 280 

atcgcctcca ttattctcaa aaacaccatg ggacacaaca caagaag 47 

<210> 281 

<211> 47 

<212> DNA 

<213> Homo Sapiens 

<220> 

<221> allele
<222> 1..47
<223> polymorphic fragment 4-50-209, variant version of SEQ ID204
<221> allele
<222> 24
<223> base T ; C in SEQ ID204
<221> primer_bind
<222> 1..23
<223> potential microsequencing oligo 4-50-209.mis1
<221> primer_bind
<222> 25..47
<223> complement potential microsequencing oligo 4-50-209.mis2
<400> 281 

atatagagtg tgpatccctg acattgaaac tgaaggcttt atggttt 47 

<210> 282 

<211> 47 

<212> DNA 

<213> Homo Sapiens 

<220> 

<221> allele
<222> 1..47
<223> polymorphic fragment 4-50-293, variant version of SEQ ID205
<221> allele
<222> 24
<223> base T ; G in SEQ ID205
<221> primer_bind
<222> 1..23
<223> potential microsequencing oligo 4-50-293.mis1
<221> primer_bind
<222> 25..47
<223> complement potential microsequencing oligo 4-50-293.mis2
<400> 282 

cctgagtccc agggggctga cagtggacag tttaaaacat tgatgaa 47 

<210> 283 

<211> 47 

<212> DNA 

<213> Homo Sapiens 

<220> 

<221> allele
<222> 1..47
<223> polymorphic fragment 4-50-323, variant version of SEQ ID206
<221> allele
<222> 24
<223> base T ; C in SEQ ID206
<221> primer_bind
<222> 1..23
<223> potential microsequencing oligo 4-50-323.mis1
<221> primer_bind
<222> 25..47
<223> complement potential microsequencing oligo 4-50-323.mis2
<400> 283 

tttaaaacat tgatgaatct ttattactac aaaagggttc gatttag 47 

<210> 284 

<211> 47 

<212> DNA 

<213> Homo Sapiens 

<220> 

<221> allele
<222> 1..47
<223> polymorphic fragment 4-50-329, variant version of SEQ ID207
<221> allele
<222> 24
<223> base T ; C in SEQ ID207
<221> primer_bind
<222> 1..23
<223> potential microsequencing oligo 4-50-329.mis1
<221> primer_bind
<222> 25..47
<223> complement potential microsequencing oligo 4-50-329.mis2
<400> 284 

acattgatga atctttatta ctataaaagg gttcgattta ggctagc 47 

<210> 285

<211> 47 

<212> DNA 

<213> Homo Sapiens 

<220> 

<221> allele
<222> 1..47
<223> polymorphic fragment 4-50-330, variant version of SEQ ID208
<221> allele
<222> 24
<223> base T ; A in SEQ ID208
<221> primer_bind
<222> 1..23
<223> potential microsequencing oligo 4-50-330.mis1
<221> primer_bind
<222> 25..47
<223> complement potential microsequencing oligo 4-50-330.mis2
<400> 285 

cattgatgaa tctttattac tactaaaggg ttcgatttag gctagcc 47 

<210> 286 

<211> 47 

<212> DNA 

<213> Homo Sapiens 

<220>
<221> allele
<222> 1..47
<223> polymorphic fragment 4-52-163, variant version of SEQ ID209
<221> allele
<222> 24
<223> base C ; A in SEQ ID209
<221> primer_bind
<222> 1..23
<223> potential microsequencing oligo 4-52-163.mis1
<221> primer_bind
<222> 25..47
<223> complement potential microsequencing oligo 4-52-163.mis2
<400> 286 

gaacaggata ttcttaacta ccacagaatt ttacacatct attgttt 47 
<210> 287 
<211> 47 
<212> DNA 

<213> Homo Sapiens 
<220> 

<221> allele
<222> 1..47
<223> polymorphic fragment 4-52-88, variant version of SEQ ID210
<221> allele
<222> 24
<223> base T ; C in SEQ ID210
<221> primer_bind
<222> 1..23
<223> potential microsequencing oligo 4-52-88.mis1
<221> primer_bind
<222> 25..47
<223> complement potential microsequencing oligo 4-52-88.mis2
<400> 287

tccatgtcat tabtattcaa aagtttaaaa aatacacaag gtgaaaa 47 

<210> 288 

<211> 47 

<212> DNA 

<213> Homo Sapiens 

<220> 

<221> allele
<222> 1..47
<223> polymorphic fragment 4-53-258, variant version of SEQ ID211
<221> allele
<222> 24
<223> base G ; A in SEQ ID211
<221> primer_bind
<222> 1..23
<223> potential microsequencing oligo 4-53-258.mis1
<221> primer_bind
<222> 25..47
<223> complement potential microsequencing oligo 4-53-258.mis2
<400> 288

gagaaatcat gcagagagaa tgcgttctca ctcaaatttt aacctaa 47 
<210> 289
<211> 47
<212> DNA
<213> Homo Sapiens
<220>
<221> allele
<222> 1..47
<223> polymorphic fragment 4-54-283, variant version of SEQ ID212
<221> allele
<222> 24
<223> base T ; A in SEQ ID212
<221> primer_bind
<222> 1..23
<223> potential microsequencing oligo 4-54-283.mis1
<221> primer_bind
<222> 25..47
<223> complement potential microsequencing oligo 4-54-283.mis2
<400> 289 

aagtagtttt tcacactttc tctttgatac aatcgatggc ttaatct 47 
<210> 290 
<211> 47 
<212> DNA 

<213> Homo Sapiens 
<220> 

<221> allele
<222> 1..47
<223> polymorphic fragment 4-54-388, variant version of SEQ ID213
<221> allele
<222> 24
<223> base C ; A in SEQ ID213
<221> primer_bind
<222> 1..23
<223> potential microsequencing oligo 4-54-388.mis1
<221> primer_bind
<222> 25..47
<223> complement potential microsequencing oligo 4-54-388.mis2
<400> 290

ctctctatcg tatacatctt tacccacgct gcagcgccaa gactcca 47
<210> 291 
<211> 47 
<212> DNA 

<213> Homo Sapiens 
<220> 

<221> allele
<222> 1..47
<223> polymorphic fragment 4-55-70, variant version of SEQ ID214
<221> allele
<222> 24
<223> base T ; A in SEQ ID214
<221> primer_bind
<222> 1..23
<223> potential microsequencing oligo 4-55-70.mis1
<221> primer_bind
<222> 25..47
<223> complement potential microsequencing oligo 4-55-70.mis2
<400> 291 

tattaagaac ctaggtttta aaatactctc tatcgtatac atcttta 47 
<210> 292 
<211> 47 
<212> DNA 

<213> Homo Sapiens 
<220>
<221> allele
<222> 1..47
<223> polymorphic fragment 4-55-95, variant version of SEQ ID215
<221> allele
<222> 24
<223> base C ; A in SEQ ID215
<221> primer_bind
<222> 1..23
<223> potential microsequencing oligo 4-55-95.mis1
<221> primer_bind
<222> 25..47
<223> complement potential microsequencing oligo 4-55-95.mis2
<400> 292 

ctctctatcg tatacatctt tacccacgct gcagcgccaa gactcca 47

<210> 293 

<211> 47 

<212> DNA 

<213> Homo Sapiens 

<220> 

<221> allele
<222> 1..47
<223> polymorphic fragment 4-56-159, variant version of SEQ ID216
<221> allele
<222> 24
<223> base T ; C in SEQ ID216
<221> primer_bind
<222> 1..23
<223> potential microsequencing oligo 4-56-159.mis1
<221> primer_bind
<222> 25..47
<223> complement potential microsequencing oligo 4-56-159.mis2
<400> 293 

aagttttcct tctcttctgt agatgtctcc atgttacagt caactat 47

<210> 294 

<211> 47 

<212> DNA 

<213> Homo Sapiens 

<220> 

<221> allele
<222> 1..47
<223> polymorphic fragment 4-56-213, variant version of SEQ ID217
<221> allele
<222> 24
<223> base G ; A in SEQ ID217
<221> primer_bind
<222> 1..23
<223> potential microsequencing oligo 4-56-213.mis1
<221> primer_bind
<222> 25..47
<223> complement potential microsequencing oligo 4-56-213.mis2
<400> 294

atggctcatg ttcactctgg ttcgccttca gaggagtttg atatttt 47

<210> 295 

<211> 47 

<212> DNA 

<213> Homo Sapiens 

<220> 

<221> allele
<222> 1..47
<223> polymorphic fragment 4-58-289, variant version of SEQ ID218
<221> allele
<222> 24
<223> base C ; G in SEQ ID218
<221> primer_bind
<222> 1..23
<223> potential microsequencing oligo 4-58-289.mis1
<221> primer_bind
<222> 25..47
<223> complement potential microsequencing oligo 4-58-289.mis2
<400> 295 

catacctgca gcctgctttt ggtcaggggt gactacttta cctgcaa 47 

<210> 296 

<211> 47 

<212> DNA 

<213> Homo Sapiens 

<220> 

<221> allele
<222> 1..47
<223> polymorphic fragment 4-58-318, variant version of SEQ ID219
<221> allele
<222> 24
<223> base C ; A in SEQ ID219
<221> primer_bind
<222> 1..23
<223> potential microsequencing oligo 4-58-318.mis1
<221> primer_bind
<222> 25..47
<223> complement potential microsequencing oligo 4-58-318.mis2
<400> 296 

tgactacttt acctgcaata tttctttgca agtttatttc ttccttt 47 

<210> 297 

<211> 47 

<212> DNA 

<213> Homo Sapiens 

<220> 

<221> allele 
<222> 1..47 

<223> polymorphic fragment 4-60-266, variant version of SEQ ID220 
<221> allele 
<222> 24 

<223> base T ; G in SEQ ID220 
<221> primer_bind
<222> 1..23

<223> potential microsequencing oligo 4-60-266.mis1
<221> primer_bind
<222> 25..47

<223> complement potential microsequencing oligo 4-60-266.mis2
<400> 297

aacaggacca agacactgca ttatataaag tttcagtatt tcttagc 47

<210> 298 

<211> 47 

<212> DNA 

<213> Homo Sapiens 

<220> 

<221> allele 
<222> 1..47

<223> polymorphic fragment 4-60-293, variant version of SEQ ID221
<221> allele
<222> 24

<223> base T ; C in SEQ ID221
<221> primer_bind
<222> 1..23

<223> potential microsequencing oligo 4-60-293.mis1
<221> primer_bind
<222> 25..47

<223> complement potential microsequencing oligo 4-60-293.mis2
<400> 298 

aagtttcagt atttcttagc agatgaagcc agcaggaagt cctccta 

<210> 299 

<211> 47 

<212> DNA 

<213> Homo Sapiens 

<220> 

<221> allele 
<222> 1..47 

<223> polymorphic fragment 4-84-241, variant version of SEQ ID222 
<221> allele
<222> 24

<223> base T ; G in SEQ ID222
<221> primer_bind
<222> 1..23

<223> potential microsequencing oligo 4-84-241.mis1
<221> primer_bind
<222> 25..47

<223> complement potential microsequencing oligo 4-84-241.mis2
<400> 299 

gaaaaaaaaa tagtgactgc cactgtgaat aattcagttc ttcagaa 

<210> 300 

<211> 47 

<212> DNA 

<213> Homo Sapiens 

<220> 

<221> allele 
<222> 1..47 

<223> polymorphic fragment 4-84-262, variant version of SEQ ID223 
<221> allele 
<222> 24

<223> base G ; A in SEQ ID223
<221> primer_bind
<222> 1..23

<223> potential microsequencing oligo 4-84-262.mis1
<221> primer_bind
<222> 25..47

<223> complement potential microsequencing oligo 4-84-262.mis2
<400> 300

acggtgaata attcagttct tcagaagcag caacatgatc tcatgga

<210> 301

<211> 47

<212> DNA

<213> Homo Sapiens

<220>

<221> allele
<222> 1..47

<223> polymorphic fragment 4-86-206, variant version of SEQ ID224
<221> allele
<222> 24

<223> base G ; A in SEQ ID224
<221> primer_bind
<222> 1..23

<223> potential microsequencing oligo 4-86-206.mis1
<221> primer_bind
<222> 25..47

<223> complement potential microsequencing oligo 4-86-206.mis2
<400> 301 

gtattcaaat caggacacac cacgaatggc atctacacgt taacatt 47 

<210> 302 

<211> 47 

<212> DNA 

<213> Homo Sapiens 

<220> 

<221> allele 
<222> 1..47

<223> polymorphic fragment 4-86-309, variant version of SEQ ID225
<221> allele
<222> 24

<223> base T ; A in SEQ ID225
<221> primer_bind
<222> 1..23

<223> potential microsequencing oligo 4-86-309.mis1
<221> primer_bind
<222> 25..47

<223> complement potential microsequencing oligo 4-86-309.mis2
<400> 302 

tggctctagg caggccactt tagtgagtga ggaaccagag agcagaa 47 
<210> 303 
<211> 47 
<212> DNA 

<213> Homo Sapiens 
<220> 

<221> allele 
<222> 1..47

<223> polymorphic fragment 4-88-349, variant version of SEQ ID226
<221> allele
<222> 24

<223> base C ; G in SEQ ID226
<221> primer_bind
<222> 1..23

<223> potential microsequencing oligo 4-88-349.mis1
<221> primer_bind
<222> 25..47

<223> complement potential microsequencing oligo 4-88-349.mis2
<400> 303 

gaaactaaaa gacaatattc agtctgagat tttccaagtt ctttatg 47 

<210> 304 

<211> 47 

<212> DNA 

<213> Homo Sapiens 

<220> 

<221> allele 
<222> 1..47

<223> polymorphic fragment 4-89-87, variant version of SEQ ID227
<221> allele
<222> 24

<223> base T ; C in SEQ ID227
<221> primer_bind
<222> 1..23

<223> potential microsequencing oligo 4-89-87.mis1
<221> primer_bind
<222> 25..47

<223> complement potential microsequencing oligo 4-89-87.mis2
<400> 304 

ttcttccctg aacgctggtt tcatatagtt tttgtgttga gaataga 47 
<210> 305 
<211> 47
<212> DNA
<213> Homo Sapiens
<220>
<221> allele
<222> 1..47

<223> polymorphic fragment 99-123-184, variant version of SEQ ID228
<221> allele
<222> 24

<223> base C ; G in SEQ ID228
<221> primer_bind
<222> 1..23

<223> potential microsequencing oligo 99-123-184.mis1
<221> primer_bind
<222> 25..47

<223> complement potential microsequencing oligo 99-123-184.mis2
<400> 305 

ccagcccaga acattcacca gctcggccaa gagttctgct gggtttt 47 
<210> 306 
<211> 47 
<212> DNA 

<213> Homo Sapiens 
<220> 

<221> allele 
<222> 1..47

<223> polymorphic fragment 99-128-202, variant version of SEQ ID229
<221> allele
<222> 24

<223> base C ; A in SEQ ID229
<221> primer_bind
<222> 1..23

<223> potential microsequencing oligo 99-128-202.mis1
<221> primer_bind
<222> 25..47

<223> complement potential microsequencing oligo 99-128-202.mis2
<400> 306 

aatgtctgtt tcttagagaa ctgcaacaca cacacataca tacacac 47 
<210> 307 
<211> 47 
<212> DNA 

<213> Homo Sapiens 
<220> 

<221> allele 
<222> 1..47

<223> polymorphic fragment 99-128-275, variant version of SEQ ID230
<221> allele
<222> 24

<223> base G ; A in SEQ ID230
<221> primer_bind
<222> 1..23

<223> potential microsequencing oligo 99-128-275.mis1
<221> primer_bind
<222> 25..47

<223> complement potential microsequencing oligo 99-128-275.mis2
<400> 307

acacccctac ctcacatgtg taggcaaatg tatgcatata tgtctct 47 

<210> 308

<211> 47

<212> DNA

<213> Homo Sapiens

<220>
<221> allele
<222> 1..47

<223> polymorphic fragment 99-128-313, variant version of SEQ ID231
<221> allele
<222> 24

<223> base G ; A in SEQ ID231
<221> primer_bind
<222> 1..23

<223> potential microsequencing oligo 99-128-313.mis1
<221> primer_bind
<222> 25..47

<223> complement potential microsequencing oligo 99-128-313.mis2
<400> 308 

tatgtctcta gacagatata catgagattc tatttggcat agaaaaa 47 
<210> 309 
<211> 47 
<212> DNA 

<213> Homo Sapiens 
<220> 

<221> allele 
<222> 1..47

<223> polymorphic fragment 99-128-60, variant version of SEQ ID232
<221> allele
<222> 24

<223> base T ; C in SEQ ID232
<221> primer_bind
<222> 1..23

<223> potential microsequencing oligo 99-128-60.mis1
<221> primer_bind
<222> 25..47

<223> complement potential microsequencing oligo 99-128-60.mis2
<400> 309 

gcactgtgac ccaggcgcta ggttcctctt acagtgacac tccgaca 47 
<210> 310 
<211> 47 
<212> DNA 

<213> Homo Sapiens 
<220> 

<221> allele 
<222> 1..47

<223> polymorphic fragment 99-12907-295, variant version of SEQ ID233
<221> allele
<222> 24

<223> base G ; A in SEQ ID233
<221> primer_bind
<222> 1..23

<223> potential microsequencing oligo 99-12907-295.mis1
<221> primer_bind
<222> 25..47

<223> complement potential microsequencing oligo 99-12907-295.mis2
<400> 310

gctatatggc attatatctc cacggggcag acctgatgta caagatg 47

<210> 311

<211> 47

<212> DNA

<213> Homo Sapiens

<220>

<221> allele
<222> 1..47

<223> polymorphic fragment 99-130-58, variant version of SEQ ID234
<221> allele
<222> 24

<223> base T ; C in SEQ ID234
<221> primer_bind
<222> 1..23

<223> potential microsequencing oligo 99-130-58.mis1
<221> primer_bind
<222> 25..47

<223> complement potential microsequencing oligo 99-130-58.mis2
<400> 311 

aaagcaaaag agcttcaaaa atatttcagg agtgtgcata tggcgag 
<210> 312 
<211> 47 
<212> DNA 

<213> Homo Sapiens 
<220> 

<221> allele 
<222> 1..47

<223> polymorphic fragment 99-134-362, variant version of SEQ ID235
<221> allele
<222> 24

<223> base T ; G in SEQ ID235
<221> primer_bind
<222> 1..23

<223> potential microsequencing oligo 99-134-362.mis1
<221> primer_bind
<222> 25..47

<223> complement potential microsequencing oligo 99-134-362.mis2
<400> 312 

caaaacactc atgttagtta gattattatt cctattacaa agataag 

<210> 313 

<211> 47 

<212> DNA 

<213> Homo Sapiens 

<220> 

<221> allele 
<222> 1..47

<223> polymorphic fragment 99-140-130, variant version of SEQ ID236
<221> allele
<222> 24

<223> base T ; C in SEQ ID236
<221> primer_bind
<222> 1..23

<223> potential microsequencing oligo 99-140-130.mis1
<221> primer_bind
<222> 25..47

<223> complement potential microsequencing oligo 99-140-130.mis2
<400> 313

tgttcaaaag cagctacaga ccatatgtaa acaattgagc atggctg 

<210> 314 

<211> 47 

<212> DNA 

<213> Homo Sapiens 

<220> 

<221> allele 
<222> 1..47

<223> polymorphic fragment 99-1462-238, variant version of SEQ ID237
<221> allele
<222> 24

<223> base C ; G in SEQ ID237
<221> primer_bind
<222> 1..23
<223> potential microsequencing oligo 99-1462-238.mis1
<221> primer_bind
<222> 25..47

<223> complement potential microsequencing oligo 99-1462-238.mis2
<400> 314 

ccctttcaag gttagtaact catctgctgt gtttctgctt cagaagg 47 

<210> 315 

<211> 47 

<212> DNA 

<213> Homo Sapiens 

<220> 

<221> allele 
<222> 1..47 

<223> polymorphic fragment 99-147-181, variant version of SEQ ID238 
<221> allele
<222> 24

<223> base G ; A in SEQ ID238
<221> primer_bind
<222> 1..23

<223> potential microsequencing oligo 99-147-181.mis1
<221> primer_bind
<222> 25..47

<223> complement potential microsequencing oligo 99-147-181.mis2
<400> 315 

gtgtcatgaa aaagagcatg atagaaagaa aaacttaaat ctttata 47 

<210> 316 

<211> 47 

<212> DNA 

<213> Homo Sapiens 

<220> 

<221> allele 
<222> 1..47

<223> polymorphic fragment 99-1474-156, variant version of SEQ ID239
<221> allele
<222> 24

<223> base T ; G in SEQ ID239
<221> primer_bind
<222> 1..23

<223> potential microsequencing oligo 99-1474-156.mis1
<221> primer_bind
<222> 25..47

<223> complement potential microsequencing oligo 99-1474-156.mis2
<400> 316 

cttgtactca taagttaaat atttataaca agaagaaata tggactt 47 
<210> 317 
<211> 47 
<212> DNA 

<213> Homo Sapiens 
<220> 

<221> allele 
<222> 1..47

<223> polymorphic fragment 99-1474-359, variant version of SEQ ID240
<221> allele
<222> 24

<223> base G ; A in SEQ ID240
<221> primer_bind
<222> 1..23

<223> potential microsequencing oligo 99-1474-359.mis1
<221> primer_bind
<222> 25..47

<223> complement potential microsequencing oligo 99-1474-359.mis2
<400> 317

aaaaaaaatc aaiittattgt accgaattcc ctaatatcag atgtgta 47

<210> 318 

<211> 47 

<212> DNA 

<213> Homo Sapiens 

<220> 

<221> allele 
<222> 1..47

<223> polymorphic fragment 99-1479-158, variant version of SEQ ID241
<221> allele
<222> 24

<223> base T ; C in SEQ ID241
<221> primer_bind
<222> 1..23

<223> potential microsequencing oligo 99-1479-158.mis1
<221> primer_bind
<222> 25..47
<223> complement potential microsequencing oligo 99-1479-158.mis2

<400> 318

tttaaaaatc cacttgtaat cgctgctaat tggagtgtat attcagg 47

<210> 319 

<211> 47 

<212> DNA 

<213> Homo Sapiens 

<220> 

<221> allele 
<222> 1..47

<223> polymorphic fragment 99-1479-379, variant version of SEQ ID242
<221> allele
<222> 24

<223> base G ; A in SEQ ID242
<221> primer_bind
<222> 1..23

<223> potential microsequencing oligo 99-1479-379.mis1
<221> primer_bind
<222> 25..47

<223> complement potential microsequencing oligo 99-1479-379.mis2

<400> 319 

gtagagctgt gtactgaggt cagggaagca gctcatggta cagcctt 

<210> 320 

<211> 47 

<212> DNA 

<213> Homo Sapiens 

<220> 

<221> allele 
<222> 1..47

<223> polymorphic fragment 99-148-129, variant version of SEQ ID243
<221> allele
<222> 24

<223> base G ; A in SEQ ID243
<221> primer_bind
<222> 1..23

<223> potential microsequencing oligo 99-148-129.mis1
<221> primer_bind
<222> 25..47

<223> complement potential microsequencing oligo 99-148-129.mis2
<400> 320

ttcatatcta tacaaataat tttgaattta atacataggg ctgcaaa 47
<210> 321
<211> 47
<212> DNA
<213> Homo Sapiens
<220>
<221> allele
<222> 1..47

<223> polymorphic fragment 99-148-132, variant version of SEQ ID244
<221> allele
<222> 24

<223> base T ; C in SEQ ID244
<221> primer_bind
<222> 1..23

<223> potential microsequencing oligo 99-148-132.mis1
<221> primer_bind
<222> 25..47

<223> complement potential microsequencing oligo 99-148-132.mis2
<400> 321 

atatctatac aaataatttt gaatttaata catagggctg caaaaca 47 

<210> 322 

<211> 47 

<212> DNA 

<213> Homo Sapiens 

<220> 

<221> allele 
<222> 1..47 

<223> polymorphic fragment 99-148-139, variant version of SEQ ID245 
<221> allele
<222> 24

<223> base T ; C in SEQ ID245
<221> primer_bind
<222> 1..23

<223> potential microsequencing oligo 99-148-139.mis1
<221> primer_bind
<222> 25..47

<223> complement potential microsequencing oligo 99-148-139.mis2
<400> 322 

tacaaataat tttgaattta atatataggg ctgcaaaaca aggttga 47 

<210> 323 

<211> 47 

<212> DNA 

<213> Homo Sapiens 

<220> 

<221> allele 
<222> 1..47

<223> polymorphic fragment 99-148-140, variant version of SEQ ID246
<221> allele
<222> 24

<223> base G ; A in SEQ ID246
<221> primer_bind
<222> 1..23

<223> potential microsequencing oligo 99-148-140.mis1
<221> primer_bind
<222> 25..47

<223> complement potential microsequencing oligo 99-148-140.mis2
<400> 323

acaaataatt ttgaatttaa tacgtagggc tgcaaaacaa ggttgat 47

<210> 324

<211> 47

<212> DNA

<213> Homo Sapiens

<220>

<221> allele
<222> 1..47

<223> polymorphic fragment 99-148-182, variant version of SEQ ID247
<221> allele
<222> 24

<223> base G ; A in SEQ ID247
<221> primer_bind
<222> 1..23
<223> potential microsequencing oligo 99-148-182.mis1
<221> primer_bind
<222> 25..47
<223> complement potential microsequencing oligo 99-148-182.mis2

<400> 324

ttgatgttga tatgggcaac tgtgtgttgg atggtcccaa agcattc 47

<210> 325 

<211> 47 

<212> DNA

<213> Homo Sapiens

<220>

<221> allele
<222> 1..47

<223> polymorphic fragment 99-148-366, variant version of SEQ ID248
<221> allele
<222> 24

<223> base T ; G in SEQ ID248
<221> primer_bind
<222> 1..23

<223> potential microsequencing oligo 99-148-366.mis1
<221> primer_bind
<222> 25..47
<223> complement potential microsequencing oligo 99-148-366.mis2

<400> 325 

tccttgtcaa aggtctctcc ctgttgctca cggctgccgc ctcaaag 

<210> 326 

<211> 47 

<212> DNA 

<213> Homo Sapiens 

<220> 

<221> allele 
<222> 1..47

<223> polymorphic fragment 99-148-76, variant version of SEQ ID249
<221> allele
<222> 24

<223> base T ; C in SEQ ID249
<221> primer_bind
<222> 1..23

<223> potential microsequencing oligo 99-148-76.mis1
<221> primer_bind
<222> 25..47

<223> complement potential microsequencing oligo 99-148-76.mis2
<400> 326 

tgatagaatg ccttcctgaa ttattactct tgatggcttc ataaaac 47 

<210> 327 

<211> 47 

<212> DNA 

<213> Homo Sapiens 

<220> 

<221> allele 

<222> 1..47

<223> polymorphic fragment 99-1480-290, variant version of SEQ ID250
<221> allele
<222> 24

<223> base T ; G in SEQ ID250
<221> primer_bind
<222> 1..23

<223> potential microsequencing oligo 99-1480-290.mis1
<221> primer_bind
<222> 25..47

<223> complement potential microsequencing oligo 99-1480-290.mis2
<400> 327 

tgcaccatct tcaccacaac ccctggcaac cactgatcct tttactg 47 

<210> 328

<211> 47

<212> DNA

<213> Homo Sapiens

<220>

<221> allele
<222> 1..47

<223> polymorphic fragment 99-1481-285, variant version of SEQ ID251
<221> allele
<222> 24

<223> base T ; G in SEQ ID251
<221> primer_bind
<222> 1..23

<223> potential microsequencing oligo 99-1481-285.mis1
<221> primer_bind
<222> 25..47

<223> complement potential microsequencing oligo 99-1481-285.mis2
<400> 328 

tcccataacc tgttttgctt ctctctctaa cctcaagatg gtataaa 47 
<210> 329 
<211> 47 
<212> DNA 

<213> Homo Sapiens 
<220> 

<221> allele 
<222> 1..47

<223> polymorphic fragment 99-1484-101, variant version of SEQ ID252
<221> allele
<222> 24

<223> base C ; A in SEQ ID252
<221> primer_bind
<222> 1..23

<223> potential microsequencing oligo 99-1484-101.mis1
<221> primer_bind
<222> 25..47

<223> complement potential microsequencing oligo 99-1484-101.mis2
<400> 329

aaaaagatca aatataagca tgtcactcct ctccttaaaa tctcagt 47
<210> 330
<211> 47
<212> DNA

<213> Homo Sapiens
<220>

<221> allele
<222> 1..47

<223> polymorphic fragment 99-1484-328, variant version of SEQ ID253
<221> allele
<222> 24

<223> base C ; G in SEQ ID253
<221> primer_bind
<222> 1..23

<223> potential microsequencing oligo 99-1484-328.mis1
<221> primer_bind
<222> 25..47

<223> complement potential microsequencing oligo 99-1484-328.mis2
<400> 330

ggacacgtgg tcktgaggag tttcaaggga ttcagttttc agatccc 47 

<210> 331

<211> 47

<212> DNA

<213> Homo Sapiens

<220>

<221> allele
<222> 1..47

<223> polymorphic fragment 99-1485-251, variant version of SEQ ID254
<221> allele
<222> 24

<223> base T ; G in SEQ ID254
<221> primer_bind
<222> 1..23

<223> potential microsequencing oligo 99-1485-251.mis1
<221> primer_bind
<222> 25..47

<223> complement potential microsequencing oligo 99-1485-251.mis2
<400> 331 

gattgccttg atatatgctc ccatagaacc aagaatgtcc ccttttc 47 

<210> 332 

<211> 47 

<212> DNA 

<213> Homo Sapiens 

<220>

<221> allele

<222> 1..47

<223> polymorphic fragment 99-1490-381, variant version of SEQ ID255
<221> allele
<222> 24

<223> base T ; C in SEQ ID255
<221> primer_bind
<222> 1..23

<223> potential microsequencing oligo 99-1490-381.mis1
<221> primer_bind
<222> 25..47

<223> complement potential microsequencing oligo 99-1490-381.mis2
<400> 332 

tgcacagtgg aaataccatg tcatggtacg ctactgtgca tctcttc 47 

<210> 333 

<211> 47 

<212> DNA 

<213> Homo Sapiens 

<220> 

<221> allele 
<222> 1..47

<223> polymorphic fragment 99-1493-280, variant version of SEQ ID256
<221> allele
<222> 24

<223> base G ; A in SEQ ID256
<221> primer_bind
<222> 1..23

<223> potential microsequencing oligo 99-1493-280.mis1
<221> primer_bind
<222> 25..47

<223> complement potential microsequencing oligo 99-1493-280.mis2
<400> 333

ggatgacaga gtittgttgg agggatgggg tttggctgct tgttttt 47

<210> 334

<211> 47

<212> DNA

<213> Homo Sapiens

<220>

<221> allele

<222> 1..47

<223> polymorphic fragment 99-151-94, variant version of SEQ ID257
<221> allele
<222> 24

<223> base G ; A in SEQ ID257
<221> primer_bind
<222> 1..23

<223> potential microsequencing oligo 99-151-94.mis1
<221> primer_bind
<222> 25..47

<223> complement potential microsequencing oligo 99-151-94.mis2
<400> 334 

attgagatca ttgataagga aatgttctaa aatttcaaaa tctatat 47 
<210> 335 
<211> 47 
<212> DNA 

<213> Homo Sapiens 
<220>
<221> allele
<222> 1..47

<223> polymorphic fragment 99-211-291, variant version of SEQ ID258
<221> allele
<222> 24

<223> base G ; A in SEQ ID258
<221> primer_bind
<222> 1..23

<223> potential microsequencing oligo 99-211-291.mis1
<221> primer_bind
<222> 25..47

<223> complement potential microsequencing oligo 99-211-291.mis2
<400> 335 

ctggttatat cagactgacc ttcgtgtttt caacaggtca atgcctt 47 
<210> 336 
<211> 46 
<212> DNA 

<213> Homo Sapiens 
<220> 

<221> allele 
<222> 1..46

<223> polymorphic fragment 99-213-37, variant version of SEQ ID259
<221> allele
<222> 23

<223> base GCl ; T in SEQ ID259
<221> primer_bind
<222> 1..22

<223> potential microsequencing oligo 99-213-37.mis1
<221> primer_bind
<222> 24..46

<223> complement potential microsequencing oligo 99-213-37.mis2
<400> 336

gtgcttccgg ctgcaggact gtgcggagga ctccagtgtc tgacag 46 
<210> 337 
<211> 47 
<212> DNA 
<213> Homo Sapiens
<220>
<221> allele
<222> 1..47

<223> polymorphic fragment 99-221-442, variant version of SEQ ID260
<221> allele
<222> 24

<223> base C ; A in SEQ ID260
<221> primer_bind
<222> 1..23

<223> potential microsequencing oligo 99-221-442.mis1
<221> primer_bind
<222> 25..47

<223> complement potential microsequencing oligo 99-221-442.mis2
<400> 337

tgcctttgta gatatgcatg ggacttccat gacctagcca gacgaat 47

<210> 338 

<211> 47 

<212> DNA 

<213> Homo Sapiens 

<220>

<221> allele

<222> 1..47

<223> polymorphic fragment 99-222-109, variant version of SEQ ID261
<221> allele
<222> 24

<223> base T ; C in SEQ ID261
<221> primer_bind
<222> 1..23

<223> potential microsequencing oligo 99-222-109.mis1
<221> primer_bind
<222> 25..47

<223> complement potential microsequencing oligo 99-222-109.mis2
<400> 338 

caggtgagga gtgctggatt ggctacgata tgaatttctt cagcagt 47 

<210> 339 

<211> 18 

<212> DNA 

<213> Homo Sapiens 

<220> 

<221> primer_bind 
<222> 1..18

<223> upstream amplification primer for SEQ 185, SEQ 262, SEQ 186, SEQ 263,
SEQ 187, SEQ 264
<400> 339 

tctaacctct catccaac 18 
<210> 340 
<211> 19 
<212> DNA 

<213> Homo Sapiens 
<220>
<221> primer_bind
<222> 1..19

<223> upstream amplification primer for SEQ 188, SEQ 265, SEQ 189, SEQ 266 
<400> 340 

gttatcgtga gactttttc 19 

<210> 341 

<211> 18 

<212> DNA 

<213> Homo Sapiens 

<220> 
<221> primer_bind
<222> 1..18

<223> upstream amplification primer for SEQ 190, SEQ 267, SEQ 191, SEQ 268
<400> 341

tgctggtgct gtgataac 18

<210> 342
<211> 18
<212> DNA

<213> Homo Sapiens
<220>

<221> primer_bind
<222> 1..18

<223> upstream amplification primer for SEQ 192, SEQ 269, SEQ 193, SEQ 270
<400> 342

tacagccctg taagacac 18

<210> 343
<211> 19
<212> DNA

<213> Homo Sapiens
<220>

<221> primer_bind
<222> 1..19

<223> upstream amplification primer for SEQ 194, SEQ 271
<400> 343

cagtatgttc aatgcacag 19

<210> 344
<211> 18
<212> DNA

<213> Homo Sapiens
<220>

<221> primer_bind
<222> 1..18

<223> upstream amplification primer for SEQ 195, SEQ 272, SEQ 196, SEQ 273
<400> 344

aaaacatcga catgggac 18

<210> 345
<211> 18
<212> DNA

<213> Homo Sapiens
<220>

<221> primer_bind
<222> 1..18

<223> upstream amplification primer for SEQ 197, SEQ 274, SEQ 198, SEQ 275,
SEQ 199, SEQ 276
<400> 345

agcatttcga gtcatgtg 18

<210> 346
<211> 18
<212> DNA

<213> Homo Sapiens
<220>

<221> primer_bind
<222> 1..18

<223> upstream amplification primer for SEQ 200, SEQ 277, SEQ 201, SEQ 278
<400> 346

ccctctttcc tcatgtag 18

<210> 347
<211> 19
<212> DNA

<213> Homo Sapiens
<220>

<221> primer_bind 
<222> 1..19 

<223> upstream amplification primer for SEQ 202, SEQ 279, SEQ 203, SEQ 280 
<400> 347 

taactcgtaa acagagaac 19 
<210> 348 
<211> 18 
<212> DNA 

<213> Homo Sapiens 
<220>
<221> primer_bind
<222> 1..18

<223> upstream amplification primer for SEQ 204, SEQ 281, SEQ 205, SEQ 282,
SEQ 206, SEQ 283, SEQ 207, SEQ 284, SEQ 208, SEQ 285
<400> 348

gcgtattgaa gctctttg 18

<210> 349 
<211> 18 
<212> DNA 

<213> Homo Sapiens 
<220> 

<221> primer_bind 
<222> 1. .18 

<223> upstream amplification primer for SEQ 209, SEQ 286, SEQ 210, SEQ 287 
<400> 349 

aacacgggga ttttaggc 18 
<210> 350 
<211> 19 
<212> DNA 

<213> Homo Sapiens 
<220>
<221> primer_bind
<222> 1..19

<223> upstream amplification primer for SEQ 211, SEQ 288
<400> 350

cacatactaa ggbtaatgg 19
<210> 351 
<211> 18 
<212> DNA 

<213> Homo Sapiens
<220>

<221> primer_bind
<222> 1..18 

<223> upstream amplification primer for SEQ 212, SEQ 289, SEQ 213, SEQ 290 
<400> 351 

gttgctggaa cctatttg 18 
<210> 352 
<211> 18 
<212> DNA 

<213> Homo Sapiens 
<220> 

<221> primer_bind 
<222> 1..18

<223> upstream amplification primer for SEQ 214, SEQ 291, SEQ 215, SEQ 292
<400> 352 

tcgatggctt aatctacc 18 

<210> 353 

<211> 18 

<212> DNA

<213> Homo Sapiens

<220>
<221> primer_bind
<222> 1..18

<223> upstream amplification primer for SEQ 216, SEQ 293, SEQ 217, SEQ 294
<400> 353

aaagaggagt aaatgggg 18

<210> 354
<211> 18
<212> DNA

<213> Homo Sapiens
<220>

<221> primer_bind
<222> 1..18

<223> upstream amplification primer for SEQ 218, SEQ 295, SEQ 219, SEQ 296
<400> 354

tccccacagc taagagcc 18

<210> 355
<211> 18
<212> DNA

<213> Homo Sapiens
<220>

<221> primer_bind
<222> 1..18

<223> upstream amplification primer for SEQ 220, SEQ 297, SEQ 221, SEQ 298
<400> 355

atacctaatt tcaggggg 18
<210> 356
<211> 19
<212> DNA

<213> Homo Sapiens
<220>

<221> primer_bind
<222> 1..19

<223> upstream amplification primer for SEQ 222, SEQ 299, SEQ 223, SEQ 300
<400> 356

ttaacagagt adcttggag 19

<210> 357
<211> 18
<212> DNA

<213> Homo Sapiens
<220>

<221> primer_bind
<222> 1..18

<223> upstream amplification primer for SEQ 224, SEQ 301, SEQ 225, SEQ 302
<400> 357

gtacagcctt ttgcttac 18

<210> 358
<211> 18
<212> DNA

<213> Homo Sapiens
<220>

<221> primer_bind
<222> 1..18

<223> upstream amplification primer for SEQ 226, SEQ 303
<400> 358

aacgtgtcat gaaagcc 18
<210> 359
<211> 19
<212> DNA

<213> Homo Sapiens
<220>

<221> primer_bind
<222> 1..19

<223> upstream amplification primer for SEQ 227, SEQ 304
<400> 359

gctgatgagt tagataacc 19

<210> 360 

<211> 18 

<212> DNA 

<213> Homo Sapiens 

<220> 

<221> primer_bind 
<222> 1. .18 

<223> upstream amplification primer for SEQ 228, SEQ 305 
<400> 360 

aaagccagga ctagaagg 
<210> 361 
<211> 18 
<212> DNA 

<213> Homo Sapiens 
<220> 

<221> primer_bind
<222> 1..18

<223> upstream amplification primer for SEQ 229, SEQ 306, SEQ 230, SEQ 307,
SEQ 231, SEQ 308, SEQ 232, SEQ 309

<400> 361

gaccagggtt ta^gttag 

<210> 362 

<211> 18 

<212> DNA 

<213> Homo Sapiens 

<220> 

<221> primer_bind 
<222> 1. .18 

<223> upstream amplification primer for SEQ 233, SEQ 310 
<400> 362 

tctgttagga cctgtgag 

<210> 363 

<211> 19 

<212> DNA 

<213> Homo Sapiens 

<220> 

<221> primer_bind 
<222> 1..19 

<223> upstream amplification primer for SEQ 234, SEQ 311
<400> 363

ccataacagc ta^tacaac 19
<210> 364
<211> 18
<212> DNA

<213> Homo Sapiens
<220>
<221> primer_bind
<222> 1..18

<223> upstream amplification primer for SEQ 235, SEQ 312

<400> 364

tggaaaggta ctcagaag 

<210> 365 

<211> 19 

<212> DNA 

<213> Homo Sapiens 
<220> 

<221> primer_bind 
<222> 1..19

<223> upstream amplification primer for SEQ 236, SEQ 313
<400> 365

agagcatagt ataaagcag 19

<210> 366 

<211> 19 

<212> DNA

<213> Homo Sapiens

<220>

<221> primer_bind
<222> 1..19

<223> upstream amplification primer for SEQ 237, SEQ 314
<400> 366

ctagaagtag ctttaacag 19

<210> 367 

<211> 19 

<212> DNA

<213> Homo Sapiens

<220>

<221> primer_bind

<222> 1..19

<223> upstream amplification primer for SEQ 238, SEQ 315
<400> 367

gcagccaatc ttatatttc 19

<210> 368 

<211> 19 

<212> DNA 

<213> Homo Sapiens 

<220> 

<221> primer_bind 
<222> 1..19

<223> upstream amplification primer for SEQ 239, SEQ 316, SEQ 240, SEQ 317
<400> 368

aaggttgtag agtagaaag 19

<210> 369 

<211> 18 

<212> DNA 

<213> Homo Sapiens 

<220> 

<221> primer_bind 
<222> 1..18 

<223> upstream amplification primer for SEQ 241, SEQ 318, SEQ 242, SEQ 319
<400> 369 

caactgacac tataaccc 18 

<210> 370

<211> 18 

<212> DNA 

<213> Homo Sapiens 

<220> 

<221> primer_bind 
<222> 1..18 

<223> upstream amplification primer for SEQ 243, SEQ 320, SEQ 244, SEQ 321, 
SEQ 245, SEQ 322, SEQ 246, SEQ 323, SEQ 247, SEQ 324, SEQ 248, SEQ 325, SEQ 
249, SEQ 326 
<400> 370 

cagtggagtg tttatgtg 

<210> 371 

<211> 19 

<212> DNA 

<213> Homo Sapiens 

<220> 
<221> primer_bind
<222> 1..19

<223> upstream amplification primer for SEQ 250, SEQ 327 
<400> 371 

ttgcacaaaa ggtatagag 19 

<210> 372 

<211> 19 

<212> DNA

<213> Homo Sapiens

<220>

<221> primer_bind

<222> 1..19

<223> upstream amplification primer for SEQ 251, SEQ 328
<400> 372

aggctcccct tttgagttg 19 

<210> 373 

<211> 18 

<212> DNA 

<213> Homo Sapiens 

<220> 

<221> primer_bind
<222> 1..18

<223> upstream amplification primer for SEQ 252, SEQ 329, SEQ 253, SEQ 330 
<400> 373 

atcctttcta gctgggag 18 
<210> 374 
<211> 20 
<212> DNA 

<213> Homo Sapiens 
<220> 

<221> primer_bind
<222> 1..20

<223> upstream amplification primer for SEQ 254, SEQ 331
<400> 374

gtttaagaat gtgtgatggg 20

<210> 375

<211> 19

<212> DNA

<213> Homo Sapiens

<220>

<221> primer_bind

<222> 1..19

<223> upstream amplification primer for SEQ 255, SEQ 332
<400> 375 

aaggcaacag cgttgtgac 19 

<210> 376 

<211> 18 

<212> DNA 

<213> Homo Sapiens 

<220> 

<221> primer_bind 
<222> 1. .18 

<223> upstream amplification primer for SEQ 256, SEQ 333 
<400> 376 

ttttgggggt tttcagtg 18 
<210> 377 
<211> 18 
<212> DNA 

<213> Homo Sapiens
<220>

<221> primer_bind
<222> 1..18

<223> upstream amplification primer for SEQ 257, SEQ 334
<400> 377

aacacaacag caaatccc 18

<210> 378

<211> 18 

<212> DNA 

<213> Homo Sapiens 

<220> 

<221> primer_bind 
<222> 1..18

<223> upstream amplification primer for SEQ 258, SEQ 335
<400> 378

tccttacttg taaccccc 18

<210> 379

<211> 20 

<212> DNA 

<213> Homo Sapiens 

<220> 

<221> primer_bind 
<222> 1..20

<223> upstream amplification primer for SEQ 259, SEQ 336
<400> 379

atactggcag cgtgtgcttc 20

<210> 380
<211> 19 
<212> DNA 

<213> Homo Sapiens 
<220>
<221> primer_bind
<222> 1..19

<223> upstream amplification primer for SEQ 260, SEQ 337
<400> 380

ccctttttct tcactgttc 19

<210> 381

<211> 20 

<212> DNA 

<213> Homo Sapiens 

<220> 

<221> primer_bind 
<222> 1..20

<223> upstream amplification primer for SEQ 261, SEQ 338
<400> 381

aggggagatg agggaagttg 20

<210> 382

<211> 20 

<212> DNA 

<213> Homo Sapiens 

<220> 

<221> primer_bind
<222> 1..20

<223> downstream amplification primer for SEQ 185, SEQ 262, SEQ 186, SEQ 263,

<400> 382

gactgtatcc tttgatgcac 20

<210> 383

<211> 20 

<212> DNA 

<213> Homo Sapiens 

<220> 

<221> primer_bind
<222> 1..20

<223> downstream amplification primer for SEQ 188, SEQ 265, SEQ 189, SEQ 266
<400> 383

gcataattgt gcttgactgg 20
<210> 384
<211> 18
<212> DNA
<213> Homo Sapiens
<220>

<221> primer_bind
<222> 1..18

<223> downstream amplification primer for SEQ 190, SEQ 267, SEQ 191, SEQ 268
<400> 384

tgctgagagg agcttttg 18
<210> 385
<211> 18
<212> DNA

<213> Homo Sapiens
<220>
<221> primer_bind
<222> 1..18

<223> downstream amplification primer for SEQ 192, SEQ 269, SEQ 193, SEQ 270
<400> 385

tgaggactgc ta^gaaag 18
<210> 386
<211> 20
<212> DNA

<213> Homo Sapiens
<220>

<221> primer_bind
<222> 1..20

<223> downstream amplification primer for SEQ 194, SEQ 271
<400> 386

acaaaatcag gaacaatggg 20
<210> 387
<211> 18
<212> DNA

<213> Homo Sapiens
<220>

<221> primer_bind
<222> 1..18

<223> downstream amplification primer for SEQ 195, SEQ 272, SEQ 196, SEQ 273
<400> 387

ttgcattttc cccccaac
<210> 388
<211> 18
<212> DNA

<213> Homo Sapiens
<220>
<221> primer_bind
<222> 1..18

<223> downstream amplification primer for SEQ 197, SEQ 274, SEQ 198, SEQ 275,
SEQ 199, SEQ 276
<400> 388

accatttgga caatgggg 18
<210> 389
<211> 20
<212> DNA

<213> Homo Sapiens
<220>

<221> primer_bind
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<222> 1. .20 

<223> downstream amplification 
<400> 389 

gctcttaaac tggctctgtg 
<210> 390 
<211> 18 
<212> DNA 

<213> Homo Sapiens 
<220> j 
<221> primer^bind 
<222> 1, .18 

<223> downstrieam amplification 
<400> 390 

ggcatgactt cacgtttc 
<210> 391 ' 
<211> 18 
<212> DNA 

<213> Homo Sapiens 
<220> 

<221> primer_bind 
<222> 1. ,18 

<223> downstream amplification 
SEQ 206, SEQ 283, SEQ 207, SEQ 
<400> 391 
aggatcttct acagtcac 
<210> 392 
<211> 20 
<212> DNA 

<213> Homo Sapiens 
<220> 

<221> primer_bind 
<222> 1. .20 

<223> downstream amplification 
<400> 392 j 
tggtagcgtt tgkaatcatc 
<210> 393 ! 
<211> 20 
<212> DNA 

<213> Homo Sapiens 

<22o> ; 

<221> primerjbind 
<222> 1. .20 ! 

<223> downstream amplification 
<400> 393 
tataagcaca aataggttcc 
<210> 394 
<211> 18 
<212> DNA 
<213> Homo Sapiens 
<220> 

<221> primerjbind 
<222> 1, .18 

<223> downstream arnplif ication 
<400> 394 
gaataactga ggggagtg 
<210> 395 
<211> 19 
<212> DNA 
<213> Homo Sapiens 
<220> j 
<221> primerjbind 



primer for SEQ 200, SEQ 277, SEQ 201 



, SEQ 278 
20 



primer for SEQ 202, SEQ 279, SEQ 203. 



SEQ 280 
18 



primer for SEQ 204, SEQ 281, SEQ 205, 
284, SEQ 208, SEQ 285 



SEQ 282, 
18 



primer for SEQ 209, SEQ 286, SEQ 210, 



SEQ 2 87 
20 



primer for SEQ 211, SEQ 288 



20 



primer for SEQ 212, SEQ 289, SEQ 213, 



SEQ 290 
18 
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<222> 1. .19 

<223> downstream amplif ica^^ on r^v.^« 

<400> 395 i ^ ^^^«tion primer for SEQ 214, SEQ 291, SEQ 215, SEQ 292 

gtgaatctcc ttttccaaa 

<210> 396 ^ 19 

<211> 18 

<212> DNA

<213> Homo Sapiens

<220>

<221> primer_bind

<222> 1..18

<223> downstream amplification primer for SEQ 216, SEQ 293, SEQ 217, SEQ 294
<400> 396

ctaaggtgtt gtagacag 18

<210> 397

<211> 20 

<212> DNA 

<213> Homo Sapiens 

<220> 

<221> primer_bind 
<222> 1..20

<223> downstream amplification primer
<400> 397

cacctcgata aatcaagtcc 20

<210> 398

<211> 20

<212> DNA

<213> Homo Sapiens

<220>

<221> primer_bind

<222> 1..20

<223> downstream amplification primer
<400> 398

gttcacttaa ttctgttgag 20

<210> 399

<211> 18 

<212> DNA 

<213> Homo Sapiens 

<220> 

<221> primer_bind
<222> 1..18

<223> downstream amplification primer
<400> 399

cgccttttct gaaaggtg 18

<210> 400
<211> 18 
<212> DNA 

<213> Homo Sapiens 
<220> 

<221> primer_bind 
<222> 1..18

<223> downstream amplification primer for SEQ 224, SEQ 301, SEQ 225, SEQ 302
<400> 400

attttctgca cagcagcg 18

<210> 401

<211> 19 

<212> DNA 

<213> Homo Sapiens 

<220>

<221> primer_bind

<222> 1..19
<223> downstream amplification primer for SEQ 226, SEQ 303
<400> 401

tattttctag ctcttctgg 19

<210> 402
<211> 19 
<212> DNA 

<213> Homo Sapiens 
<220> 

<221> primer_bind
<222> 1..19

<223> downstream amplification primer for SEQ 227, SEQ 304
<400> 402
agcaagagtg attgtaaag 19

<210> 403

<211> 18 

<212> DNA 

<213> Homo Sapiens 

<220> 

<221> primer_bind 
<222> 1..18

<223> downstream amplification primer for SEQ 228, SEQ 305
<400> 403 

tattcagaaa ggagtggg 
<210> 404
<211> 18
<212> DNA

<213> Homo Sapiens
<220>

<221> primer_bind
<222> 1..18

sJq'23??"^S1^,^J'23'2'?1^S fSs"'" ''''' 
<400> 404 

agagcgttct tgcctttc 

<210> 405 

<211> 20 

<212> DNA 

<213> Homo Sapiens 

<220> 

<221> primer_bind 
<222> 1..20

<223> downstream amplification primer for SEQ 233, SEQ 310
<400> 405

ggtaacccta aaatgttatc 20

<210> 406

<211> 21 

<212> DNA 

<213> Homo Sapiens 

<220>

<221> primer_bind

<222> 1..21

<223> downstream amplification primer for SEQ 234, SEQ 311
<400> 406

agaaaccata agggtatatt g 21

<210> 407

<211> 19 

<212> DNA 

<213> Homo Sapiens 

<220> 

<221> primer_bind 
<222> 1..19

<223> downstream amplification primer for SEQ 235, SEQ 312
<400> 407
acagtgcaaa ggttatatc 19

<210> 408

<211> 21 

<212> DNA

<213> Homo Sapiens 

<220> 

<221> primer_bind 
<222> 1..21 

<223> downstream amplification primer for SEQ 236, SEQ 313
<400> 408
gaacaacctt gaattagctt g 21

<210> 409
<211> 21 
<212> DNA 

<213> Homo Sapiens 
<220>
<221> primer_bind
<222> 1..21

<223> downstream amplification primer for SEQ 237, SEQ 314
<400> 409
gattccagaa gtccatttca g 21

<210> 410

<211> 21 

<212> DNA 

<213> Homo Sapiens 

<220> 

<221> primer_bind 
<222> 1..21

<223> downstream amplification primer for SEQ 238, SEQ 315
<400> 410
aggtaagaat gagcaaaaag g 21

<210> 411

<211> 19 

<212> DNA 

<213> Homo Sapiens 

<220> 

<221> primer_bind 
<222> 1..19

<223> downstream amplification primer for SEQ 239, SEQ 316, SEQ 240, SEQ 317
<400> 411

gcttgtgttt gttcaattc 19

<210> 412

<211> 18 

<212> DNA 

<213> Homo Sapiens 

<220>

<221> primer_bind

<222> 1..18

<223> downstream amplification primer for SEQ 241, SEQ 318, SEQ 242, SEQ 319
<400> 412
cttgaaatac tcccagcc 18

<210> 413

<211> 19 

<212> DNA 

<213> Homo Sapiens 

<220> 

<221> primer_bind
<222> 1..19
<400> 413

ccatgaactg agaactttg 19

<210> 414
<211> 18 
<212> DNA 

<213> Homo Sapiens 
<220> 

<221> primer_bind 
<222> 1..18

<223> downstream amplification primer for SEQ 250, SEQ 327
<400> 414
ggtgacaggt aakgaaac 18

<210> 415

<211> 21 

<212> DNA 

<213> Homo Sapiens 

<220> 

<221> primer_bind
<222> 1..21

<223> downstream amplification primer for SEQ 251, SEQ 328
<400> 415
attcaggcac agaagtcata c 21

<210> 416
<211> 21 
<212> DNA 

<213> Homo Sapiens 
<220> 

<221> primer_bind 
<222> 1..21

<223> downstream amplification primer for SEQ 252, SEQ 329, SEQ 253, SEQ 330
<400> 416
agggcagcac aatgtagtaa g 21

<210> 417

<211> 18 

<212> DNA 

<213> Homo Sapiens 

<220> 

<221> primer_bind 
<222> 1..18

<223> downstream amplification primer for SEQ 254, SEQ 331
<400> 417
cctctttatc tcpaaacc 18

<210> 418
<211> 19 
<212> DNA 

<213> Homo Sapiens 
<220> 

<221> primer_bind 
<222> 1..19

<223> downstream amplification primer for SEQ 255, SEQ 332
<400> 418
gaaaacaatc aagctctgg 19

<210> 419

<211> 19 

<212> DNA 

<213> Homo Sapiens 

<220> 

<221> primer_bind
<222> 1..19

<223> downstream amplification primer for SEQ 256, SEQ 333 
<400> 419 

cctttatatc cttggagtc 
<210> 420
<211> 21
<212> DNA
<213> Homo Sapiens
<220>

<221> primer_bind
<222> 1..21

<223> downstream amplification primer for SEQ 257, SEQ 334

<400> 420

tattacacgt tccaactctt c 

<210> 421 

<211> 20 

<212> DNA 

<213> Homo Sapiens 

<220> 

<221> primer_bind 
<222> 1. .20 

<223> downstream amplification primer for SEQ 258, SEQ 335 
<400> 421 

ctgtgtttaa gtgactgctg 20

<210> 422 

<211> 21 

<212> DNA 

<213> Homo Sapiens 

<220> 

<221> primer_bind 
<222> 1. .21 

<223> downstream amplification primer for SEQ 259, SEQ 336 
<400> 422 

ttattgcccc acktgcttga g 

<210> 423

<211> 19 

<212> DNA 

<213> Homo Sapiens 

<220>

<221> primer_bind 

<222> 1. .19 

<223> downstream amplification primer for SEQ 260, SEQ 337 
<400> 423 

tcattcgtct ggctaggtc 

<210> 424 

<211> 21 

<212> DNA 

<213> Homo Sapiens 

<220> 

<221> primer_bind
<222> 1..21

<223> downstream amplification primer for SEQ 261, SEQ 338
<400> 424 

gaaacagact gaagcaagga c 21 

<210> 425 

<211> 19 

<212> DNA 

<213> Homo Sapiens 

<220>

<221> primer_bind

<222> 1..19
<223> potential microsequencing oligo for 4-14-107.mis1

<400> 425

acaaccacca aatgcatac

<210> 426

<211> 19 

<212> DNA

<213> Homo Sapiens

<220>

<221> primer_bind

<222> 1..19

<223> potential microsequencing oligo for 4-14-317.mis1
<400> 426 

acatgcaagg tgggcaaga 
<210> 427 
<211> 19 
<212> DNA 

<213> Homo Sapiens 
<220> 

<221> primer_bind 
<222> 1..19

<223> potential microsequencing oligo for 4-14-35.mis1
<400> 427 

aacacagaaa ccgctaaaa 
<210> 428 
<211> 23 
<212> DNA
<213> Homo Sapiens
<220>
<221> primer_bind
<222> 1..23

<223> microsequencing oligo for 4-20-149.mis1

<400> 428

tttttgctgt gtcttcaaag tga 

<210> 429 

<211> 19 

<212> DNA 

<213> Homo Sapiens 

<220> 

<221> primer_bind 
<222> 1..19

<223> potential microsequencing oligo for 4-20-77.mis1
<400> 429 

acatgaagat tctgaaggg 

<210> 430 

<211> 23 

<212> DNA 

<213> Homo Sapiens 

<220> 

<221> primer_bind 
<222> 1..23

<223> microsequencing oligo for 4-22-174.mis1

<400> 430

ggattgtgca gaagttgcct ttc 

<210> 431 

<211> 19 

<212> DNA 

<213> Homo Sapiens 

<220> 

<221> primer_bind 
<222> 1..19

<223> potential microsequencing oligo for 4-22-176.mis1
<400> 431

tgcagaagtt gcbtttcat 

<210> 432 

<211> 19 

<212> DNA 

<213> Homo Sapiens 

<220> 

<221> primer_bind 
<222> 1..19

<223> potential microsequencing oligo for 4-26-60.mis1
<400> 432 

ggaaagtgca tcttaagac 

<210> 433 

<211> 19 

<212> DNA 

<213> Homo Sapiens 

<220> 

<221> primer_bind 
<222> 1..19

<223> potential microsequencing oligo for 4-26-72.mis1

<400> 433

ttaagacagt ta^caggcc 

<210> 434 

<211> 19 

<212> DNA 

<213> Homo Sapiens 

<220>

<221> primer_bind

<222> 1..19

<223> potential microsequencing oligo for 4-3-130.mis1
<400> 434 

gggcctaaaa cagtattct 

<210> 435 

<211> 19 

<212> DNA 

<213> Homo Sapiens 

<220> 

<221> primer_bind 
<222> 1..19

<223> potential microsequencing oligo for 4-38-63.mis1
<400> 435 

agttataaga aaatcaggc 

<210> 436 

<211> 19 

<212> DNA 

<213> Homo Sapiens 

<220> 

<221> primer_bind
<222> 1..19

<223> potential microsequencing oligo for 4-38-83.mis1
<400> 436 

gaggctaaac tt:tttttt 

<210> 437

<211> 19 

<212> DNA 

<213> Homo Sapiens 

<220> 

<221> primer_bind 
<222> 1..19 

<223> potential microsequencing oligo for 4-4-152.mis1
<400> 437
ttcccattgt tcctgactt

<210> 438 

<211> 23 

<212> DNA 

<213> Homo Sapiens 

<220> 

<221> primer_bind
<222> 1..23

<223> microsequencing oligo for 4-4-187.mis1
<400> 438 

tataaacaga aacatggatg agt 

<210> 439 

<211> 19 

<212> DNA 

<213> Homo Sapiens 

<220>

<221> primer_bind

<222> 1..19

<223> potential microsequencing oligo for 4-4-288.mis1
<400> 439 

catcaactaa ttttcacaa 

<210> 440 

<211> 19 

<212> DNA 

<213> Homo Sapiens 

<220> 

<221> primer_bind 
<222> 1..19

<223> potential microsequencing oligo for 4-42-304.mis1
<400> 440 

tttaaaacta tttatgtaa 

<210> 441 

<211> 23 

<212> DNA 

<213> Homo Sapiens 

<220> 

<221> primer_bind
<222> 1..23

<223> microsequencing oligo for 4-42-401.mis1
<400> 441 

taagaaagaa ttctgtgttc tgg 

<210> 442 

<211> 19 

<212> DNA 

<213> Homo Sapiens 

<220>

<221> primer_bind

<222> 1..19

<223> potential microsequencing oligo for 4-43-328.mis1
<400> 442 

ttctgtgttc tggccaaag 

<210> 443 

<211> 23 

<212> DNA 

<213> Homo Sapiens 

<220> 

<221> primer_bind
<222> 1..23

<223> microsequencing oligo for 4-43-70.mis1
<400> 443 

atcgcctcca ttattctcaa aaa 
<210> 444
<211> 23 
<212> DNA 

<213> Homo Sapiens 
<220> 

<221> primer_bind
<222> 1..23

<223> microsequencing oligo for 4-50-209.mis1
<400> 444

atatagagtg tgcatccctg aca 23

<210> 445

<211> 23 

<212> DNA 

<213> Homo Sapiens 

<220>

<221> primer_bind

<222> 1..23

<223> microsequencing oligo for 4-50-293.mis1
<400> 445 

cctgagtccc agggggctga cag 23 
<210> 446 
<211> 23 
<212> DNA 

<213> Homo Sapiens 
<220>

<221> primer_bind
<222> 1..23

<223> microsequencing oligo for 4-50-323.mis1
<400> 446 

tttaaaacat tgatgaatct tta 23 

<210> 447 

<211> 23 

<212> DNA

<213> Homo Sapiens

<220>

<221> primer_bind

<222> 1..23

<223> microsequencing oligo for 4-50-329.mis1
<400> 447

acattgatga atctttatta cta 23

<210> 448 

<211> 19 

<212> DNA 

<213> Homo Sapiens 

<220> 

<221> primer_bind 
<222> 1..19 

<223> potential microsequencing oligo for 4-50-330.mis1
<400> 448 

gatgaatctt tattactac 

<210> 449 

<211> 23 

<212> DNA 

<213> Homo Sapiens 

<220> 

<221> primer_bind 
<222> 1..23

<223> microsequencing oligo for 4-52-163.mis1
<400> 449

gaacaggata ttbttaacta cca 23 
<210> 450








<211> 23 

<212> DNA 

<213> Homo Sapiens 

<220>

<221> primer_bind

<222> 1..23

<223> microsequencing oligo for 4-52-88.mis1
<400> 450 

tccatgtcat tabtattcaa aag 

<210> 451 

<211> 19 

<212> DNA 

<213> Homo Sapiens 

<220> 

<221> primer_bind
<222> 1..19

<223> potential microsequencing oligo for 4-53-258.mis1
<400> 451 

aatcatgcag agagaatgc 

<210> 452 

<211> 23 

<212> DNA 

<213> Homo Sapiens 

<220> 

<221> primer_bind 
<222> 1..23

<223> microsequencing oligo for 4-54-283.mis1

<400> 452

aagtagtttt tc'pcactttc tot 

<210> 453

<211> 19 

<212> DNA 

<213> Homo Sapiens 

<220>

<221> primer_bind

<222> 1..19

<223> potential microsequencing oligo for 4-54-388.mis1
<400> 453 

ctatcgtata catctttac 

<210> 454 

<211> 19 

<212> DNA 

<213> Homo Sapiens 

<220> 

<221> primer_bind 
<222> 1..19

<223> potential microsequencing oligo for 4-55-70.mis1
<400> 454 

aagaacctag gttttaaaa 
<210> 455 
<211> 23 
<212> DNA 
<213> Homo Sapiens 
<220>
<221> primer_bind
<222> 1..23

<223> microsequencing oligo for 4-55-95.mis1

<400> 455

ctctctatcg tatacatctt tac

<210> 456

<211> 23 
<212> DNA
<213> Homo Sapiens
<220>
<221> primer_bind
<222> 1..23

<223> microsequencing oligo for 4-56-159.mis1
<400> 456 

aagttttcct tctcttctgt aga 23 

<210> 457 

<211> 19 

<212> DNA 

<213> Homo Sapiens 

<220> 

<221> primer_bind 
<222> 1..19

<223> potential microsequencing oligo for 4-56-213.mis1
<400> 457 

ctcatgttca ctctggttc 19 

<210> 458 

<211> 23 

<212> DNA 

<213> Homo Sapiens 

<220>

<221> primer_bind

<222> 1..23

<223> microsequencing oligo for 4-58-289.mis1
<400> 458

catacctgca gcctgctttt ggt 23

<210> 459

<211> 23 

<212> DNA 

<213> Homo Sapiens 

<220> 

<221> primer_bind 
<222> 1..23

<223> microsequencing oligo for 4-58-318.mis1
<400> 459 

tgactacttt acctgcaata ttt 23 

<210> 460 

<211> 23 

<212> DNA 

<213> Homo Sapiens 

<220> 

<221> primer_bind 
<222> 1..23

<223> microsequencing oligo for 4-60-266.mis1
<400> 460 

aacaggacca agacactgca tta 23 

<210> 461 

<211> 23 

<212> DNA 

<213> Homo Sapiens

<220>

<221> primer_bind

<222> 1..23

<223> microsequencing oligo for 4-60-293.mis1
<400> 461 

aagtttcagt atttcttagc aga 23 
<210> 462 
<211> 19 
<212> DNA 
<213> Homo Sapiens
<220>

<221> primer_bind
<222> 1..19

<223> potential microsequencing oligo for 4-84-241.mis1
<400> 462 

aaaaaatagt gactgccac 

<210> 463 

<211> 19 

<212> DNA 

<213> Homo Sapiens 

<220> 

<221> primer_bind 
<222> 1..19

<223> potential microsequencing oligo for 4-84-262.mis1
<400> 463 

tgaataattc agttcttca 
<210> 464
<211> 19
<212> DNA
<213> Homo Sapiens
<220>
<221> primer_bind
<222> 1..19

<223> potential microsequencing oligo for 4-86-206.mis1
<400> 464

tcaaatcagg acacaccac 
<210> 465 
<211> 19 
<212> DNA 

<213> Homo Sapiens 
<220> 

<221> primer_bind 
<222> 1..19

<223> potential microsequencing oligo for 4-86-309.mis1
<400> 465 

tctaggcagg ccactttag 

<210> 466 

<211> 19 

<212> DNA 

<213> Homo Sapiens 

<220> 

<221> primer_bind
<222> 1..19

<223> potential microsequencing oligo for 4-88-349.mis1
<400> 466 

ctaaaagaca atattcagt 

<210> 467 

<211> 23 

<212> DNA 

<213> Homo Sapiens 

<220> 

<221> primer_bind 
<222> 1..23

<223> microsequencing oligo for 4-89-87.mis1
<400> 467

ttcttccctg aacgctggtt tca

<210> 468 

<211> 19 

<212> DNA 

<213> Homo Sapiens 
<220>

<221> primer_bind
<222> 1..19

<223> potential microsequencing oligo for 99-123-184.mis1
<400> 468 

cccagaacat tcaccagct 19 
<210> 469 
<211> 19 
<212> DNA 

<213> Homo Sapiens 
<220>
<221> primer_bind
<222> 1..19

<223> potential microsequencing oligo for 99-128-202.mis1
<400> 469 

tctgtttctt agagaactg 19 

<210> 470 

<211> 19 

<212> DNA 

<213> Homo Sapiens 

<220> 

<221> primer_bind 
<222> 1..19

<223> potential microsequencing oligo for 99-128-275.mis1
<400> 470 

ccctacctca catgtgtag 19 

<210> 471 

<211> 19 

<212> DNA 

<213> Homo Sapiens 

<220> 

<221> primer_bind
<222> 1..19

<223> potential microsequencing oligo for 99-128-313.mis1
<400> 471 

tctctagaca gakatacat 19 

<210> 472

<211> 23 

<212> DNA 

<213> Homo Sapiens 

<220> 

<221> primer_bind 
<222> 1..23

<223> microsequencing oligo for 99-128-60.mis1
<400> 472

cactgtgacc caggcgctag cgt 23 

<210> 473 

<211> 19 

<212> DNA 

<213> Homo Sapiens 

<220> 

<221> primer_bind 
<222> 1..19 

<223> potential microsequencing oligo for 99-12907-295.mis1
<400> 473 

tatggcatta tatctccac 19 

<210> 474 

<211> 19 

<212> DNA 

<213> Homo Sapiens 

<220>
<221> primer_bind
<222> 1..19

<223> microsequencing oligo for 99-130-58.mis1
<400> 474

caaaagagct tcaaaaata 19
<210> 475
<211> 19 
<212> DNA 

<213> Homo Sapiens 
<220>
<221> primer_bind
<222> 1..19

<223> potential microsequencing oligo for 99-134-362.mis1
<400> 475

acactcatgt tagttagat 19 

<210> 476 

<211> 19 

<212> DNA 

<213> Homo Sapiens 

<220> 

<221> primer_bind 
<222> 1..19

<223> microsequencing oligo for 99-140-130.mis1
<400> 476 

caaaagcagc tacagacca 19 

<210> 477 

<211> 19 

<212> DNA 

<213> Homo Sapiens 

<220>

<221> primer_bind

<222> 1..19

<223> microsequencing oligo for 99-1462-238.mis1
<400> 477

ttcaaggtta gtaactcat 19 

<210> 478 

<211> 19 

<212> DNA 

<213> Homo Sapiens 

<220> 

<221> primer_bind 
<222> 1..19

<223> potential microsequencing oligo for 99-147-181.mis1
<400> 478 

catgaaaaag agcatgata 19 

<210> 479 

<211> 19 

<212> DNA 

<213> Homo Sapiens 

<220> 

<221> primer_bind
<222> 1..19

<223> potential microsequencing oligo for 99-1474-156.mis1
<400> 479 

tactcataag ttaaatatt 19 

<210> 480 

<211> 19 

<212> DNA 

<213> Homo Sapiens 

<220>

<221> primer_bind
<222> 1..19

<223> potential microsequencing oligo for 99-1474-359.mis1

<400> 480

aaaatcaaat tattgtacc 19

<210> 481 

<211> 19 

<212> DNA 

<213> Homo Sapiens 

<220> 

<221> primer_bind 
<222> 1..19

<223> microsequencing oligo for 99-1479-158.mis1

<400> 481

aaaatccact tgtaatcgc 19

<210> 482 

<211> 19 

<212> DNA 

<213> Homo Sapiens 

<220> 

<221> primer_bind 

<222> 1..19

<223> potential microsequencing oligo for 99-1479-379.mis1

<400> 482

agctgtgtac tgaggtcag 19

<210> 483

<211> 19 

<212> DNA 

<213> Homo Sapiens 

<220>

<221> primer_bind
<222> 1..19

<223> potential microsequencing oligo for 99-148-129.mis1

<400> 483

tatctataca aataatttt 19

<210> 484 

<211> 19 

<212> DNA 

<213> Homo Sapiens 

<220> 

<221> primer_bind 
<222> 1..19

<223> potential microsequencing oligo for 99-148-132.mis1

<400> 484

ctatacaaat aattttgaa 19

<210> 485 

<211> 19 

<212> DNA 

<213> Homo Sapiens 

<220>

<221> primer_bind

<222> 1..19

<223> potential microsequencing oligo for 99-148-139.mis1

<400> 485

aataattttg aatttaata 19

<210> 486

<211> 19 

<212> DNA 

<213> Homo Sapiens 

<220> 

<221> primer_bind
<222> 1..19
<223> potential microsequencing oligo for 99-148-140.mis1

<400> 486

ataattttga atttaatac 19

<210> 487 

<211> 19

<212> DNA 

<213> Homo Sapiens 

<220> 

<221> primer_bind
<222> 1..19

<223> potential microsequencing oligo for 99-148-182.mis1
<400> 487 

tgttgatatg ggcaactgt 

<210> 488 

<211> 19 

<212> DNA 

<213> Homo Sapiens 

<220> 

<221> primer_bind
<222> 1..19

<223> potential microsequencing oligo for 99-148-366.mis1

<400> 488

tgtcaaaggt ctctccctg 19

<210> 489 

<211> 19 

<212> DNA 

<213> Homo Sapiens 

<220> 

<221> primer_bind 
<222> 1..19

<223> potential microsequencing oligo for 99-148-76.mis1
<400> 489 

agaatgcctt cctgaatta 

<210> 490 

<211> 19 

<212> DNA 

<213> Homo Sapiens 

<220> 

<221> primer_bind 
<222> 1..19

<223> potential microsequencing oligo for 99-1480-290.mis1

<400> 490

ccatcttcac cacaacccc 19

<210> 491 

<211> 19 

<212> DNA

<213> Homo Sapiens

<220>

<221> primer_bind

<222> 1..19

<223> potential microsequencing oligo for 99-1481-285.mis1

<400> 491

ataacctgtt ttgcttctc 19

<210> 492 

<211> 19 

<212> DNA 

<213> Homo Sapiens 

<220> 

<221> primer_bind 

<222> 1..19

<223> potential microsequencing oligo for 99-1484-101.mis1
<400> 492

agatcaaata taagcatgt 19 
<210> 493 
<211> 19 
<212> DNA 

<213> Homo Sapiens 
<220> 

<221> primer_bind
<222> 1..19

<223> microsequencing oligo for 99-1484-328.mis1
<400> 493 

acgtggtcat gaggagttt 19 
<210> 494 
<211> 19 
<212> DNA 

<213> Homo Sapiens
<220>
<221> primer_bind
<222> 1..19

<223> potential microsequencing oligo for 99-1485-251.mis1
<400> 494

gccttgatat atgctccca 19 

<210> 495 

<211> 19 

<212> DNA 

<213> Homo Sapiens 

<220> 

<221> primer_bind
<222> 1..19

<223> microsequencing oligo for 99-1490-381.mis1
<400> 495 

cagtggaaat accatgtca 19 

<210> 496 

<211> 19 

<212> DNA 

<213> Homo Sapiens 

<220> 

<221> primer_bind 
<222> 1..19

<223> potential microsequencing oligo for 99-1493-280.mis1
<400> 496

gacagagtat tgttggagg 19 

<210> 497 

<211> 19 

<212> DNA 

<213> Homo Sapiens 

<220> 

<221> primer_bind 
<222> 1..19 

<223> potential microsequencing oligo for 99-151-94.mis1
<400> 497 

agatcattga taaggaaat 19 
<210> 498 
<211> 19 
<212> DNA 

<213> Homo Sapiens 
<220> 

<221> primer_bind 
<222> 1..19 

<223> microsequencing oligo for 99-211-291.mis1
<400> 498
ttatatcaga ctgaccttc 19

<210> 499 

<211> 19 

<212> DNA 

<213> Homo Sapiens 

<220> 

<221> primer_bind 
<222> 1..19

<223> potential microsequencing oligo for 99-213-37.mis1
<400> 499

cttccggctg capgactgt 19 

<210> 500 

<211> 19 

<212> DNA 

<213> Homo Sapiens 

<220> 

<221> primer_bind
<222> 1..19

<223> potential microsequencing oligo for 99-221-442.mis1
<400> 500 

tttgtagata tgcatggga 19 

<210> 501 

<211> 23 

<212> DNA 

<213> Homo Sapiens 

<220> 

<221> primer_bind 
<222> 1..23

<223> microsequencing oligo for 99-222-109.mis1
<400> 501 

caggtgagga gtgctggatt ggc 23 

<210> 502 

<211> 23 

<212> DNA 

<213> Homo Sapiens 

<220>

<221> primer_bind

<222> 1..23

<223> microsequencing oligo for 4-14-107.mis2
<400> 502

ctatcaggca tttgcctggt tgc 23 

<210> 503 

<211> 23 

<212> DNA 

<213> Homo Sapiens 

<220> 

<221> primer_bind
<222> 1..23

<223> microsequencing oligo for 4-14-317.mis2
<400> 503 

tcatgacctg tgcccacctc ttt 23 

<210> 504 

<211> 23 

<212> DNA 

<213> Homo Sapiens 

<220> 

<221> primer_bind 
<222> 1..23

<223> microsequencing oligo for 4-14-35.mis2
<400> 504

tctctgcaga cagcttctgc ctg 23 
<210> 505

<211> 19 

<212> DNA 

<213> Homo Sapiens

<220>

<221> primer_bind

<222> 1..19

<223> potential microsequencing oligo for 4-20-149.mis2
<400> 505 

agcaggcaat aaaccaaga 19 

<210> 506 

<211> 19 

<212> DNA 

<213> Homo Sapiens 

<220> 

<221> primer_bind 
<222> 1..19

<223> potential microsequencing oligo for 4-20-77.mis2
<400> 506 

gtgttctcag acaacaaag 19 
<210> 507 
<211> 19 
<212> DNA 

<213> Homo Sapiens 
<220> 

<221> primer_bind
<222> 1..19

<223> potential microsequencing oligo for 4-22-174.mis2
<400> 507 

aaattaacat ttttgaaca 19 

<210> 508 

<211> 19 

<212> DNA 

<213> Homo Sapiens 

<220> 

<221> primer_bind 
<222> 1..19

<223> potential microsequencing oligo for 4-22-176.mis2
<400> 508 

acaaattaac atttttgaa 19 

<210> 509 

<211> 23 

<212> DNA 

<213> Homo Sapiens 

<220> 

<221> primer_bind 
<222> 1..23

<223> microsequencing oligo for 4-26-60.mis2
<400> 509 

aagtcgctcc tcggcctgct aac 23 

<210> 510 

<211> 19 

<212> DNA 

<213> Homo Sapiens 

<220> 

<221> primer_bind 
<222> 1..19

<223> potential microsequencing oligo for 4-26-72.mis2
<400> 510 

accctttaaa gtcgctcct 19 
<210> 511 
<211> 23

<212> DNA

<213> Homo Sapiens

<220>

<221> primer_bind

<222> 1..23

<223> microsequencing oligo for 4-3-130.mis2
<400> 511 

agttaatacc aatttaagct tta 

<210> 512 

<211> 19 

<212> DNA 

<213> Homo Sapiens 

<220> 

<221> primer_bind

<222> 1..19
<223> potential microsequencing oligo for 4-38-63.mis2

<400> 512 

aaaaaaaaag tttagcctc 

<210> 513 

<211> 23 

<212> DNA

<213> Homo Sapiens 

<220> 

<221> primer_bind
<222> 1..23

<223> microsequencing oligo for 4-38-83.mis2

<400> 513

atattctcaa cagcattgcc aaa 

<210> 514

<211> 19 

<212> DNA 

<213> Homo Sapiens 

<220> 

<221> primer_bind 
<222> 1..19

<223> potential microsequencing oligo for 4-4-152.mis2
<400> 514 

tgtttatata taggataac 

<210> 515 

<211> 19 

<212> DNA 

<213> Homo Sapiens 

<220> 

<221> primer_bind 
<222> 1..19

<223> potential microsequencing oligo for 4-4-187.mis2
<400> 515 

tttttttttt ttttttttt 
<210> 516 
<211> 19 
<212> DNA 
<213> Homo Sapiens 
<220>
<221> primer_bind
<222> 1..19

<223> potential microsequencing oligo for 4-4-288.mis2
<400> 516 

tgaaatcaaa acataggta 
<210> 517 
<211> 19 
<212> DNA

<213> Homo Sapiens

<220>

<221> primer_bind
<222> 1..19

<223> potential microsequencing oligo for 4-42-304.mis2
<400> 517

aaaaacccct gaaaataag 19

<210> 518

<211> 19

<212> DNA

<213> Homo Sapiens

<220>

<221> primer_bind
<222> 1..19

<223> potential microsequencing oligo for 4-42-401.mis2
<400> 518

tctgtgggtt taaactttg 19

<210> 519

<211> 19

<212> DNA

<213> Homo Sapiens

<220>

<221> primer_bind
<222> 1..19

<223> potential microsequencing oligo for 4-43-328.mis2
<400> 519

actggctctg tgggtttaa 19

<210> 520

<211> 19

<212> DNA

<213> Homo Sapiens

<220>

<221> primer_bind
<222> 1..19

<223> potential microsequencing oligo for 4-43-70.mis2
<400> 520

ttgtgttgtg tcccatggt 19

<210> 521

<211> 19

<212> DNA

<213> Homo Sapiens

<220>

<221> primer_bind
<222> 1..19

<223> potential microsequencing oligo for 4-50-209.mis2

<400> 521

cataaagcct tcagtttca 19

<210> 522

<211> 19

<212> DNA

<213> Homo Sapiens

<220>

<221> primer_bind
<222> 1..19

<223> potential microsequencing oligo for 4-50-293.mis2
<400> 522

tcaatgtttt aaactgtcc 19
<210> 523
<211> 19
<212> DNA
<213> Homo Sapiens
<220> 

<221> primer_bind
<222> 1..19

<223> potential microsequencing oligo for 4-50-323.mis2
<400> 523 

atcgaaccct tttgtagta 19 

<210> 524 

<211> 19 

<212> DNA 

<213> Homo Sapiens 

<220>

<221> primer_bind

<222> 1..19

<223> potential microsequencing oligo for 4-50-329.mis2
<400> 524

gcctaaatcg aaccctttt 19 

<210> 525

<211> 19 

<212> DNA 

<213> Homo Sapiens

<220>

<221> primer_bind
<222> 1..19

<223> potential microsequencing oligo for 4-50-330.mis2
<400> 525 

agcctaaatc gaacccttt 19 

<210> 526 

<211> 19 

<212> DNA 

<213> Homo Sapiens 

<220> 

<221> primer_bind 
<222> 1..19

<223> potential microsequencing oligo for 4-52-163.mis2
<400> 526 

aatagatgtg taaaattct 19 

<210> 527 

<211> 19 

<212> DNA 

<213> Homo Sapiens

<220>

<221> primer_bind

<222> 1..19

<223> potential microsequencing oligo for 4-52-88.mis2
<400> 527

caccttgtgt attttttaa 19 

<210> 528 

<211> 23 

<212> DNA 

<213> Homo Sapiens 

<220> 

<221> primer_bind 
<222> 1..23

<223> microsequencing oligo for 4-53-258.mis2
<400> 528 

ttaggttaaa atttgagtga gaa 23 

<210> 529 

<211> 19 

<212> DNA 

<213> Homo Sapiens 
<220>

<221> primer_bind
<222> 1..19

<223> potential microsequencing oligo for 4-54-283.mis2

<400> 529

taagccatcg attgtatca

<210> 530 

<211> 19 

<212> DNA

<213> Homo Sapiens

<220>

<221> primer_bind

<222> 1..19

<223> potential microsequencing oligo for 4-54-388.mis2
<400> 530 

gtcttggcgc tgcagcgtg 

<210> 531 

<211> 23 

<212> DNA 

<213> Homo Sapiens 

<220> 

<221> primer_bind 
<222> 1..23

<223> microsequencing oligo for 4-55-70.mis2
<400> 531 

taaagatgta tacgatagag agt 

<210> 532 

<211> 19 

<212> DNA 

<213> Homo Sapiens 

<220>

<221> primer_bind

<222> 1..19

<223> potential microsequencing oligo for 4-55-95.mis2

<400> 532

gtcttggcgc tgcagcgtg 

<210> 533

<211> 19 

<212> DNA 

<213> Homo Sapiens 

<220> 

<221> primer_bind 
<222> 1..19 

<223> potential microsequencing oligo for 4-56-159.mis2
<400> 533 

ttgactgtaa catggagac 

<210> 534 

<211> 19 

<212> DNA 

<213> Homo Sapiens 

<220> 

<221> primer_bind
<222> 1..19

<223> potential microsequencing oligo for 4-56-213.mis2
<400> 534 

tatcaaactc ctctgaagg 
<210> 535
<211> 19
<212> DNA
<213> Homo Sapiens
<220>
<221> primer_bind
<222> 1..19

<223> potential microsequencing oligo for 4-58-289.mis2
<400> 535

aggtaaagta gtcacccct 19

<210> 536 

<211> 19 

<212> DNA 

<213> Homo Sapiens 

<220> 

<221> primer_bind
<222> 1..19

<223> potential microsequencing oligo for 4-58-318.mis2
<400> 536 

gaagaaataa acttgcaaa 19 

<210> 537 

<211> 19 

<212> DNA 

<213> Homo Sapiens 

<220> 

<221> primer_bind 
<222> 1..19

<223> potential microsequencing oligo for 4-60-266.mis2
<400> 537 

agaaatactg aaactttat 19 

<210> 538 

<211> 19 

<212> DNA 

<213> Homo Sapiens 

<220>

<221> primer_bind

<222> 1..19

<223> potential microsequencing oligo for 4-60-293.mis2
<400> 538 

aggacttcct gctggcttc 19 

<210> 539 

<211> 23 

<212> DNA 

<213> Homo Sapiens 

<220> 

<221> primer_bind 
<222> 1..23 

<223> microsequencing oligo for 4-84-241.mis2
<400> 539 

ttctgaagaa ctgaattatt cac 23 

<210> 540 

<211> 19 

<212> DNA 

<213> Homo Sapiens 

<220> 

<221> primer_bind 
<222> 1..19

<223> potential microsequencing oligo for 4-84-262.mis2
<400> 540

tgagatcatg ttgctgctt 19 

<210> 541

<211> 23 

<212> DNA 

<213> Homo Sapiens 

<220> 

<221> primer_bind 
<222> 1..23

<223> microsequencing oligo for 4-86-206.mis2
<400> 541

aatgttaacg tgtagatgcc att 23 

<210> 542 

<211> 19 

<212> DNA 

<213> Homo Sapiens 

<220> 

<221> primer_bind
<222> 1..19

<223> potential microsequencing oligo for 4-86-309.mis2
<400> 542 

gctctctggt tcctcactc 19 
<210> 543 
<211> 19 
<212> DNA 

<213> Homo Sapiens 
<220> 

<221> primer_bind
<222> 1..19

<223> potential microsequencing oligo for 4-88-349.mis2
<400> 543

aagaacttgg aaaatctca 19 

<210> 544 

<211> 19 

<212> DNA

<213> Homo Sapiens

<220>

<221> primer_bind

<222> 1..19

<223> potential microsequencing oligo for 4-89-87.mis2
<400> 544 

ttctcaacac aaaaactat 19 

<210> 545 

<211> 19 

<212> DNA 

<213> Homo Sapiens 

<220> 

<221> primer_bind
<222> 1..19

<223> potential microsequencing oligo for 99-123-184.mis2
<400> 545 

cccagcagaa ctcttggcc 19 

<210> 546 

<211> 19 

<212> DNA

<213> Homo Sapiens

<220>

<221> primer_bind

<222> 1..19

<223> potential microsequencing oligo for 99-128-202.mis2
<400> 546

gtatgtatgt gt^tgtgtt 19 

<210> 547

<211> 19 

<212> DNA 

<213> Homo Sapiens 

<220> 

<221> primer_bind 
<222> 1..19
<223> potential microsequencing oligo for 99-128-275 .mis2
<400> 547 

acatatatgc atacatttg 19 

<210> 548 

<211> 19 

<212> DNA 

<213> Homo Sapiens 

<220> 

<221> primer_bind
<222> 1..19

<223> potential microsequencing oligo for 99-128-313 .mis2 
<400> 548 

tctatgccaa atagaatct 19 

<210> 549 

<211> 19 

<212> DNA 

<213> Homo Sapiens

<220>

<221> primer_bind

<222> 1..19

<223> potential microsequencing oligo for 99-128-60 .mis2 
<400> 549 

ggagtgtcac tgtaagagg 19 

<210> 550 

<211> 19 

<212> DNA 

<213> Homo Sapiens 

<220> 

<221> primer_bind 

<222> 1..19

<223> microsequencing oligo for 99-12907-295 .mis2 

<400> 550 

ttgtacatca ggtctgccc 19 

<210> 551 

<211> 23 

<212> DNA 

<213> Homo Sapiens 

<220> 

<221> primer_bind 
<222> 1..23

<223> microsequencing oligo for 99-130-58 .mis2
<400> 551

ctcgccatat gcacactcct gaa 23 

<210> 552

<211> 19 

<212> DNA 

<213> Homo Sapiens 

<220>

<221> primer_bind

<222> 1..19

<223> microsequencing oligo for 99-134-362 .mis2
<400> 552 

tctttgtaat aggaataat 19 

<210> 553 

<211> 19 

<212> DNA 

<213> Homo Sapiens 

<220> 

<221> primer_bind 
<222> 1..19

<223> microsequencing oligo for 99-140-130 .mis2
<400> 553

catgctcaat tgtttacat 19 
<210> 554 
<211> 19 
<212> DNA 

<213> Homo Sapiens 
<220> 

<221> primer_bind 
<222> 1..19

<223> potential microsequencing oligo for 99-1462-238 .mis2
<400> 554

ctgaagcaga ae^acagca 19 
<210> 555
<211> 23 
<212> DNA 

<213> Homo Sapiens 
<220>

<221> primer_bind
<222> 1..23

<223> microsequencing oligo for 99-147-181 .mis2 
<400> 555 

tataaagatt taagtttttc ttt 23 
<210> 556 
<211> 19 
<212> DNA 

<213> Homo Sapiens 
<220> 

<221> primer_bind 
<222> 1..19

<223> microsequencing oligo for 99-1474-156 .mis2 
<400> 556 

ccatatttct tcttgttat 19 
<210> 557 
<211> 19 
<212> DNA 

<213> Homo Sapiens 
<220>
<221> primer_bind
<222> 1..19

<223> potential microsequencing oligo for 99-1474-359 .mis2
<400> 557

catctgatat tagggaatt 19 

<210> 558 

<211> 19 

<212> DNA 

<213> Homo Sapiens 

<220> 

<221> primer_bind
<222> 1..19

<223> potential microsequencing oligo for 99-1479-158 .mis2 
<400> 558 

aatatacact ccaattagc 19 
<210> 559 
<211> 19 
<212> DNA 

<213> Homo Sapiens 
<220> 

<221> primer_bind 
<222> 1..19

<223> potential microsequencing oligo for 99-1479-379 .mis2 
<400> 559 
ctgtaccatg agctgcttc 19

<210> 560 

<211> 19 

<212> DNA 

<213> Homo Sapiens 

<220> 

<221> primer_bind
<222> 1..19

<223> potential microsequencing oligo for 99-148-129 .mis2
<400> 560 

cagccctatg tattaaatt 19 

<210> 561 

<211> 19 

<212> DNA 

<213> Homo Sapiens 

<220> 

<221> primer_bind 
<222> 1..19

<223> potential microsequencing oligo for 99-148-132 .mis2 
<400> 561 

ttgcagccct atgtattaa 19 

<210> 562 

<211> 19 

<212> DNA 

<213> Homo Sapiens 

<220> 

<221> primer_bind 
<222> 1..19

<223> potential microsequencing oligo for 99-148-139 .mis2 
<400> 562 

ccttgttttg cagccctat 19 

<210> 563 

<211> 19 

<212> DNA 

<213> Homo Sapiens 

<220> 

<221> primer_bind 
<222> 1..19

<223> potential microsequencing oligo for 99-148-140 .mis2 
<400> 563 

accttgtttt gcagcccta 19 

<210> 564 

<211> 23 

<212> DNA 

<213> Homo Sapiens 

<220> 

<221> primer_bind 
<222> 1..23

<223> microsequencing oligo for 99-148-182 .mis2
<400> 564 

gaatgctttg ggaccatcca aca 23 

<210> 565 

<211> 19 

<212> DNA 

<213> Homo Sapiens 

<220> 

<221> primer_bind 
<222> 1..19

<223> potential microsequencing oligo for 99-148-366 .mis2 
<400> 565 

gaggcggcag ccgtgagca 19 
<210> 566

<211> 19 

<212> DNA 

<213> Homo Sapiens

<220> 

<221> primer_bind
<222> 1..19

<223> potential microsequencing oligo for 99-148-76 .mis2
<400> 566 

tatgaagcca tcaagagta 19 

<210> 567 

<211> 19 

<212> DNA 

<213> Homo Sapiens 

<220> 

<221> primer_bind 
<222> 1..19

<223> microsequencing oligo for 99-1480-290 .mis2 
<400> 567 

aaaaggatca gtggttgcc 19 

<210> 568 

<211> 19 

<212> DNA 

<213> Homo Sapiens 

<220> 

<221> primer_bind 
<222> 1..19

<223> microsequencing oligo for 99-1481-285 .mis2
<400> 568

taccatcttg aggttagag 19 

<210> 569 

<211> 19 

<212> DNA 

<213> Homo Sapiens 

<220> 

<221> primer_bind 
<222> 1..19

<223> potential microsequencing oligo for 99-1484-101 .mis2
<400> 569 

agattttaag gagaggagt 19 

<210> 570 

<211> 19 

<212> DNA 

<213> Homo Sapiens 

<220> 

<221> primer_bind 
<222> 1..19

<223> potential microsequencing oligo for 99-1484-328 .mis2 
<400> 570 

tctgaaaact gaatccctt 19 

<210> 571

<211> 19

<212> DNA

<213> Homo Sapiens 

<220> 

<221> primer_bind 
<222> 1..19

<223> microsequencing oligo for 99-1485-251 .mis2 
<400> 571 

aggggacatt cttggttct 19 
<210> 572 
<211> 19
<212> DNA 

<213> Homo Sapiens 
<220> 

<221> primer_bind
<222> 1..19

<223> potential microsequencing oligo for 99-1490-381 .mis2
<400> 572 

agatgcacag tagcgtacc 19

<210> 573 

<211> 19 

<212> DNA 

<213> Homo Sapiens 

<220> 

<221> primer_bind 
<222> 1..19

<223> microsequencing oligo for 99-1493-280 .mis2
<400> 573 

acaagcagcc aaaccccat 19

<210> 574 

<211> 23 

<212> DNA 

<213> Homo Sapiens 

<220> 

<221> primer_bind
<222> 1..23

<223> microsequencing oligo for 99-151-94 .mis2

<400> 574

atatagattt tgaaatttta gaa 23

<210> 575 

<211> 19 

<212> DNA 

<213> Homo Sapiens 

<220> 

<221> primer_bind 
<222> 1..19 

<223> potential microsequencing oligo for 99-211-291 .mis2 
<400> 575 

cattgacctg ttgaaaaca 19

<210> 576 

<211> 19 

<212> DNA 

<213> Homo Sapiens 

<220> 

<221> primer_bind
<222> 1..19

<223> potential microsequencing oligo for 99-213-37 .mis2

<400> 576

tcagacactg ga^tcctcc 

<210> 577 

<211> 19 

<212> DNA 

<213> Homo Sapiens 

<220> 

<221> primer_bind
<222> 1..19 

<223> potential microsequencing oligo for 99-221-442 .mis2 
<400> 577 

gtctggctag gtcatggaa 19
<210> 578 
<211> 19 
<212> DNA

<213> Homo Sapiens 
<220> 

<221> primer_bind
<222> 1..19

<223> potential microsequencing oligo for 99-222-109 .mis2
<400> 578 

ctgaagaaat tcatatcgt 19 



